Keyword (Internet search)
An index term, subject term, subject heading, or descriptor, in information retrieval, is a term that captures the essence of the topic of a document. Index terms make up a controlled vocabulary for use in bibliographic records. They are an integral part of bibliographic control, which is the function by which libraries collect, organize and disseminate documents. They are used as keywords to retrieve documents in an information system, for instance, a catalog or a search engine. A popular form of keywords on the web is tags, which are directly visible and can also be assigned by non-experts. Index terms can consist of a word, phrase, or alphanumerical term. They are created by analyzing the document either manually with subject indexing or automatically with automatic indexing or more sophisticated methods of keyword extraction. Index terms can either come from a controlled vocabulary or be freely assigned.

Keywords are stored in a search index. Common words like articles (a, an, the) and conjunctions (and, or, but) are not treated as keywords because it is inefficient to do so. Almost every English-language site on the Internet contains the article "the", so searching for it does little to narrow the results. The most popular search engine, Google, removed stop words such as "the" and "a" from its indexes for several years, but then re-introduced them, making certain types of precise search possible again.

The term "descriptor" was coined by Calvin Mooers in 1948. It is used in particular for a preferred term from a thesaurus.
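
One way to picture a descriptor as the preferred term of a thesaurus is a lookup table that maps entry terms to the term chosen for indexing. The sketch below assumes a tiny hypothetical thesaurus; the table `USE_REFERENCES`, its entries, and the function `to_descriptors` are invented for illustration and do not come from any real vocabulary.

```python
# Hypothetical mini-thesaurus: each entry term points to its preferred descriptor.
USE_REFERENCES = {
    "cars": "Automobiles",
    "motor cars": "Automobiles",
    "automobiles": "Automobiles",
    "films": "Motion pictures",
    "movies": "Motion pictures",
    "motion pictures": "Motion pictures",
}

def to_descriptors(free_terms, use_references=USE_REFERENCES):
    """Split free terms into (descriptors, unmatched), keeping first-seen order."""
    descriptors, unmatched = [], []
    for term in free_terms:
        key = " ".join(term.lower().split())  # normalise case and whitespace
        preferred = use_references.get(key)
        if preferred is None:
            unmatched.append(term)  # not in the controlled vocabulary
        elif preferred not in descriptors:
            descriptors.append(preferred)  # avoid listing a descriptor twice
    return descriptors, unmatched

result = to_descriptors(["Movies", "cars", "Films", "zeppelins"])
```

Here "Movies" and "Films" collapse to the single descriptor "Motion pictures", while "zeppelins" is reported as unmatched because the hypothetical thesaurus has no entry for it.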

The Simple Knowledge Organization System (SKOS) language provides a way to express index terms with the Resource Description Framework (RDF) for use in the context of the Semantic Web.
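
As a rough illustration of what an index term looks like in SKOS, the following Python sketch renders one concept as RDF in Turtle syntax using plain string formatting. The `skos:` namespace and the properties `skos:prefLabel`, `skos:altLabel` and `skos:broader` belong to SKOS itself; the `example.org` URIs, the labels, and the helper functions are assumptions made for the example.

```python
def turtle_literal(text, lang="en"):
    """Quote a single-line string as a language-tagged Turtle literal."""
    # Only backslashes and double quotes are escaped; line breaks are not handled.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'

def skos_concept(uri, pref_label, alt_labels=(), broader=None):
    """Render one index term as a skos:Concept in Turtle syntax."""
    lines = [f"<{uri}> a skos:Concept ;"]
    lines.append(f"    skos:prefLabel {turtle_literal(pref_label)}")
    for label in alt_labels:
        lines[-1] += " ;"
        lines.append(f"    skos:altLabel {turtle_literal(label)}")
    if broader:
        lines[-1] += " ;"
        lines.append(f"    skos:broader <{broader}>")
    lines[-1] += " ."  # close the statement
    return "\n".join(lines)

PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."

# Hypothetical concept: URIs and labels are placeholders.
doc = PREFIX + "\n\n" + skos_concept(
    "http://example.org/terms/automobiles",
    "Automobiles",
    alt_labels=["Cars"],
    broader="http://example.org/terms/vehicles",
)
print(doc)
```

The preferred label plays the role of the descriptor, alternative labels record non-preferred entry terms, and `skos:broader` links the term to a more general one. A production system would more likely use an RDF library than hand-built strings.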

Author keywords

Many journals and databases also provide access to index terms assigned by the authors of the articles being published or represented. The relative quality of indexer-provided and author-provided index terms is of interest to research in information retrieval. The quality of both kinds of index terms depends, of course, on the qualifications of the provider. In general, authors have difficulty providing index terms that characterize their document relative to the other documents in the database.

Examples

  • Canadian Subject Headings
  • Library of Congress Subject Headings
  • Medical Subject Headings
  • Polythematic Structured Subject Heading System (PSH)


See also

  • Keyword cloud
  • Keyword density
  • Keyword optimization
  • Keyword tagging
  • Subject (documents)

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 