Thesaurus
A thesaurus is a reference work that lists words grouped together according to similarity of meaning (containing synonyms and sometimes antonyms), in contrast to a dictionary, which contains definitions and pronunciations. The largest thesaurus in the world is the Historical Thesaurus of the Oxford English Dictionary, which contains more than 920,000 entries.

History and use of term

In antiquity, Philo of Byblos authored the first text that could now be called a thesaurus. In Sanskrit, the Amarakosha is a thesaurus in verse form, written in the 4th century. The first example of the modern genre, Roget's Thesaurus, was compiled in 1805 by Peter Mark Roget and published in 1852. Entries in Roget's Thesaurus are listed conceptually rather than alphabetically.

Although it includes synonyms, a thesaurus should not be taken as a complete list of all the synonyms for a particular word. The entries are also designed for drawing distinctions between similar words and assisting in choosing exactly the right word. Unlike a dictionary entry, a thesaurus entry does not give the definition of words.

The word "thesaurus" is derived from 16th-century New Latin, in turn from Latin thesaurus, which is the latinisation of the Greek θησαυρός (thēsauros), literally "treasure store", generally meaning a collection of things which are of great importance or value (and thus the medieval rank of thesaurer was a synonym for treasurer). This meaning has been largely supplanted by Roget's usage of the term.

Thesauri in IT

In information science, library science, and information technology, specialized thesauri are designed for information retrieval. They are a type of controlled vocabulary, for indexing or tagging purposes. Such a thesaurus can be used as the basis of an index for online material. The Art and Architecture Thesaurus, for example, is used to index the Canadian...

Information retrieval thesauri are formally organized so that existing relationships between concepts are made explicit. As a result, they are more complex than simpler controlled vocabularies such as authority lists and synonym rings. Each term is placed in context, allowing a user to distinguish between "bureau" the office and "bureau" the furniture. Following international standards, they are generally arranged hierarchically by themes, topics or facets. Unlike a literary thesaurus, these specialized thesauri typically focus on one discipline, subject or field of study.
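
As a minimal sketch of how placing a term in context separates homographs, the following Python snippet files each sense of "bureau" under a different heading. The headings ("Organizations", "Furniture") and the names `build_index` and `senses` are assumptions made for illustration; they are not taken from any published thesaurus.

```python
from collections import defaultdict

# Hypothetical entries: (label, heading that supplies the context).
ENTRIES = [
    ("bureau", "Organizations"),
    ("bureau", "Furniture"),
    ("desk", "Furniture"),
]


def build_index(entries):
    """Map each label to the list of headings it is filed under."""
    index = defaultdict(list)
    for label, context in entries:
        index[label].append(context)
    return dict(index)


def senses(index, label):
    """Return qualified forms of a label, one per context it appears in."""
    return [f"{label} ({context})" for context in index.get(label, [])]


INDEX = build_index(ENTRIES)
print(senses(INDEX, "bureau"))  # ['bureau (Organizations)', 'bureau (Furniture)']
```

A label with more than one qualified form is a homograph, and the heading shown in parentheses is what lets a user pick the intended sense.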

In information technology, a thesaurus represents a database or list of semantically orthogonal topical search keys. In the field of artificial intelligence, a thesaurus may sometimes be referred to as an ontology.

Thesauri for information retrieval are typically constructed by information specialists, and have their own unique vocabulary defining different kinds of terms and relationships:

Terms are the basic semantic units for conveying concepts. They are usually single-word nouns, since nouns are the most concrete part of speech. Verbs can be converted to nouns: "cleans" to "cleaning", "reads" to "reading", and so on. Adjectives and adverbs, however, seldom convey any meaning useful for indexing. When a term is ambiguous, a “scope note” can be added to ensure consistency and give direction on how to interpret the term. Not every term needs a scope note, but their presence is of considerable help in using a thesaurus correctly and reaching a correct understanding of the given field of knowledge.
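
One possible way to picture a term record with an optional scope note is sketched below in Python. The class `Term`, the function `describe`, and the example note for "Mercury" are hypothetical; "SN" is used here as the conventional tag for a scope note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """A thesaurus term; only ambiguous terms carry a scope note."""
    label: str
    scope_note: Optional[str] = None


def describe(term: Term) -> str:
    """Render a term, adding an 'SN:' line when a scope note exists."""
    if term.scope_note is None:
        return term.label
    return f"{term.label}\n  SN: {term.scope_note}"


# Hypothetical entries: one unambiguous term, one that needs a note.
cleaning = Term("Cleaning")
mercury = Term("Mercury", "The chemical element, not the planet.")

print(describe(cleaning))
print(describe(mercury))
```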

"Term relationships" are links between terms. These relationships can be divided into three types: hierarchical, equivalency or associative.
  • Hierarchical relationships are used to indicate terms which are narrower and broader in scope. A "Broader Term" (BT) or hyperonym is a more general term, e.g. “Apparatus” is a generalization of “Computers”. Reciprocally, a "Narrower Term" (NT) or hyponym is a more specific term, e.g. “Digital Computers” is a specialization of “Computers”. BT and NT are reciprocals; a broader term necessarily implies at least one other term which is narrower. BT and NT are used to indicate class relationships, as well as part-whole relationships (meronyms and holonyms).

  • The equivalency relationship is used primarily to connect synonyms and near-synonyms. Use (USE) and Used For (UF) indicators are used when an authorized term is to be used for another, unauthorized, term; for example, the entry for the authorized term "Frequency" could have the indicator "UF Pitch". Reciprocally, the entry for the unauthorized term "Pitch" would have the indicator "USE Frequency". Unauthorized terms are often called "entry vocabulary", "entry points", "lead-in terms", or "non-preferred terms", pointing to the authorized term (also referred to as the Preferred Term or Descriptor) that has been chosen to stand for the concept. As such, their presence in text can be used by automated indexing software to suggest the Preferred Term as an Indexing Term.

  • Associative relationships are used to connect two related terms whose relationship is neither hierarchical nor equivalent. This relationship is described by the indicator "Related Term" (RT). Associative relationships should be applied with caution, since excessive use of RTs will reduce specificity in searches. Consider the following: if the typical user is searching with term "A", would they also want resources tagged with term "B"? If the answer is no, then an associative relationship should not be established.
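
The three relationship types above can be sketched as a small in-memory structure. The Python below is an illustration under stated assumptions, not a reference implementation of any standard: the class `Thesaurus` and its method names are invented for this example, the "Computers"/"Software" related-term link is hypothetical, and the word-boundary matching in `suggest_index_terms` is a deliberately naive stand-in for real automated indexing.

```python
import re
from collections import defaultdict


class Thesaurus:
    """Toy store for BT/NT, USE/UF and RT links; reciprocals stay in sync."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its Broader Terms (BT)
        self.narrower = defaultdict(set)  # term -> its Narrower Terms (NT)
        self.use = {}                     # non-preferred term -> preferred term
        self.used_for = defaultdict(set)  # preferred term -> its UF entries
        self.related = defaultdict(set)   # term -> its Related Terms (RT)

    def add_hierarchy(self, broader, narrower):
        # Recording a BT link also records the reciprocal NT link.
        self.broader[narrower].add(broader)
        self.narrower[broader].add(narrower)

    def add_equivalence(self, preferred, non_preferred):
        # "USE" goes on the entry term, "UF" on the preferred term.
        self.use[non_preferred] = preferred
        self.used_for[preferred].add(non_preferred)

    def add_related(self, a, b):
        # RT is symmetric, so store it in both directions.
        self.related[a].add(b)
        self.related[b].add(a)

    def preferred(self, term):
        """Follow a USE reference; preferred terms map to themselves."""
        return self.use.get(term, term)

    def expand(self, term):
        """Return the preferred term plus all narrower terms, transitively."""
        start = self.preferred(term)
        seen, stack = {start}, [start]
        while stack:
            for nt in self.narrower.get(stack.pop(), ()):
                if nt not in seen:
                    seen.add(nt)
                    stack.append(nt)
        return seen

    def suggest_index_terms(self, text):
        """Suggest preferred terms whose label or entry term occurs in text."""
        labels = (set(self.use) | set(self.used_for) | set(self.broader)
                  | set(self.narrower) | set(self.related))
        lowered = text.lower()
        return {
            self.preferred(label)
            for label in labels
            if re.search(r"\b" + re.escape(label.lower()) + r"\b", lowered)
        }


thesaurus = Thesaurus()
thesaurus.add_hierarchy("Apparatus", "Computers")
thesaurus.add_hierarchy("Computers", "Digital Computers")
thesaurus.add_equivalence("Frequency", "Pitch")
thesaurus.add_related("Computers", "Software")  # hypothetical RT link

print(thesaurus.preferred("Pitch"))            # Frequency
print(sorted(thesaurus.expand("Apparatus")))   # ['Apparatus', 'Computers', 'Digital Computers']
print(thesaurus.suggest_index_terms("The pitch of the tone rises"))
```

Related terms are stored but intentionally left out of `expand`, which mirrors the caution above: following RT links automatically would broaden a search and reduce its specificity.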

Literary thesauri

  • Thesaurus of English Words & Phrases (ed. P. Roget); ISBN 0-06-272037-6, see: Roget's Thesaurus.
  • World Thesaurus (ed. C. Laird); ISBN 0-671-51983-2. This edition has been used in successive editions since 1971 by Webster's:
  • Oxford American Desk Thesaurus (ed. C. Lindberg); ISBN 0-19-512674-2
  • Oxford Paperback Thesaurus: Third Edition; ISBN 978-0-19-861425-8
  • Random House Word Menu by Stephen Glazier; ISBN 0-679-40030-3
  • Historical Thesaurus of English (HTE), http://www.arts.gla.ac.uk/SESLL/EngLang/thesaur/toe1.htm
  • WordNet
  • OpenThesaurus
  • The Well-Spoken Thesaurus by Tom Heehler; ISBN 978-1402243059

Specialized thesauri for information retrieval

  • NAL Agricultural Thesaurus (United States National Agricultural Library, United States Department of Agriculture)
  • European Thesaurus on International Relations and Area Studies; ISBN 978-3-927674-11-0
  • Evaluation Thesaurus (by M. Scriven); ISBN 0-8039-4364-4
  • Thesaurus of Psychological Index Terms (APA); ISBN 1-55798-775-0
  • Clinician's Thesaurus (by E. Zuckerman); ISBN 1-57230-569-X
  • Art and Architecture Thesaurus (Getty Institute)
  • Eurovoc Thesaurus (Europa Publications Office)
  • AGROVOC Thesaurus (Food and Agriculture Organization of the United Nations)
  • GEMET - GEneral Multilingual Environmental Thesaurus (European Environment Agency)
  • Medical Subject Headings (United States National Library of Medicine)
  • Global Legal Information Network Thesaurus, GLIN Subject Term Index

Standards and manuals

The ANSI/NISO Z39.19 Standard of 2005 defines guidelines and conventions for the format, construction, testing, maintenance, and management of monolingual controlled vocabularies including lists, synonym rings, taxonomies, and thesauri.

For multilingual vocabularies, the ISO 5964 Guidelines for the establishment and development of multilingual thesauri can be applied.

Thesaurus Construction and Use: a practical manual. Jean Aitchison, Alan Gilchrist and David Bawden. London and New York: Europa Publications (2000).

See also

  • AGRIS
  • Controlled vocabulary
  • Dictionary
  • Knowledge Organization Systems
  • Ontology (computer science)
  • Simple Knowledge Organisation System

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 