Ontology learning - AbsoluteAstronomy.com

Ontology learning is a subtask of information extraction. The goal of ontology learning is to semi-automatically extract relevant concepts and relations from a given corpus or other kinds of data sets to form an ontology.

The automatic creation of ontologies is a task that involves many disciplines. Typically, the process starts by extracting terms and concepts, or noun phrases, from plain text using a method from terminology extraction. This usually involves linguistic processors (e.g. part-of-speech tagging, phrase chunking). Then statistical or symbolic techniques are used to extract relation signatures. For instance, these approaches try to detect that "to eat" denotes a relation between a concept denoted by "animal" and a concept denoted by "food". The ontology formalizes the intensional aspects of a domain; the extensional part is provided by a knowledge base that contains instances of the concepts and relations defined in the ontology. Recently, a graph-based approach has been proposed which extracts a domain taxonomy, i.e. the backbone of an ontology, from scratch.
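The pipeline described above (tagged text, then noun phrase chunks, then relation signatures) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus, the hand-assigned part-of-speech tags, the tiny lemma table, and the function names chunk and relation_signatures are hypothetical and do not come from any particular ontology learning system. A real system would use a trained tagger and chunker and a much larger corpus, and would apply a statistical test instead of raw counts.

```python
from collections import Counter

# Hypothetical toy corpus, tokenized and POS-tagged by hand.
# In practice a part-of-speech tagger would produce these tags.
TAGGED_SENTENCES = [
    [("the", "DET"), ("animal", "NOUN"), ("eats", "VERB"), ("food", "NOUN")],
    [("an", "DET"), ("animal", "NOUN"), ("eats", "VERB"),
     ("fresh", "ADJ"), ("food", "NOUN")],
    [("the", "DET"), ("dog", "NOUN"), ("eats", "VERB"), ("meat", "NOUN")],
]

# Hypothetical lemma table; a real pipeline would use a lemmatizer.
LEMMAS = {"eats": "eat"}

NP_TAGS = ("DET", "ADJ", "NOUN")


def chunk(tagged):
    """Group a tagged sentence into ("NP", head) and ("VP", lemma) chunks.

    A noun phrase is a maximal run of determiners, adjectives and nouns
    that contains at least one noun; its head is the last noun in the run.
    """
    chunks = []
    run = []  # tokens of the noun phrase currently being collected

    def flush():
        nouns = [word for word, tag in run if tag == "NOUN"]
        if nouns:
            chunks.append(("NP", nouns[-1]))
        run.clear()

    for word, tag in tagged:
        if tag in NP_TAGS:
            run.append((word, tag))
            continue
        flush()
        if tag == "VERB":
            chunks.append(("VP", LEMMAS.get(word, word)))
    flush()
    return chunks


def relation_signatures(sentences):
    """Count (verb, subject head, object head) for each NP VP NP sequence."""
    counts = Counter()
    for tagged in sentences:
        chunks = chunk(tagged)
        for left, mid, right in zip(chunks, chunks[1:], chunks[2:]):
            if (left[0], mid[0], right[0]) == ("NP", "VP", "NP"):
                counts[(mid[1], left[1], right[1])] += 1
    return counts


for signature, count in relation_signatures(TAGGED_SENTENCES).most_common():
    print(signature, count)
```

On this toy corpus the most frequent signature links "eat" to "animal" and "food", which matches the example in the text. Raw co-occurrence counts are only the simplest possible statistical technique; they are used here to keep the sketch short.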

See also