Universal Networking Language
Universal Networking Language (UNL) is a declarative formal language specifically designed to represent semantic data extracted from natural language texts. It can be used as a pivot language in interlingual machine translation systems or as a knowledge representation language in information retrieval applications.
UNL was created at the Institute of Advanced Studies of the United Nations University, in Tokyo, and it has been developed at the UNDL Foundation, in Geneva, Switzerland, along with a large community of researchers all over the world (the so-called UNL Society).
Scope and Goals
The UNL is an effort to achieve a simple basis for representing the most central aspects of information and meaning in a form that is independent of both machines and human languages. As a language-independent formalism, the UNL aims at coding, storing, disseminating and retrieving information independently of the original language in which it was expressed. In this sense, UNL seeks to provide the tools for overcoming the language barrier in a systematic way.
At first glance, the UNL seems to be a multilingual machine translation system, i.e., a kind of interlingua, to which the source texts are converted before being translated into the target languages. It can, in fact, be used for such a purpose, and efficiently so. However, its real strength is knowledge representation, and its primary objective is to serve as an infrastructure for handling knowledge that already exists or can exist in any given language.
Nevertheless, it would be unrealistic at this point to claim that the “full” meaning of any word, sentence or text can be represented for any language. Subtleties of intention and interpretation make the “full meaning”, whatever concept we might have of it, too variable and subjective for any systematic treatment. The UNL avoids the pitfalls of trying to represent the “full meaning” of sentences or texts, targeting instead the “core” or “consensual” meaning that is most often attributed to them. In this sense, much of the subtlety of poetry, metaphor, figurative language, innuendo and other complex, indirect communicative behaviors is beyond the current scope and goals of the UNL. Instead, the UNL targets direct communicative behavior and literal meanings as a tangible, concrete basis for much or most of human communication in practical, day-to-day settings.
Structure
In the UNL approach, information conveyed by natural language is represented, sentence by sentence, as a hypergraph composed of a set of directed binary labeled links (referred to as relations) between nodes or hypernodes (the Universal Words, or simply UWs), which stand for concepts. UWs can also be annotated with attributes representing context information.
As an example, the English sentence ‘The sky was blue?!’ can be represented in UNL as follows:
aoj(blue(icl>color).@entry.@past.@interrogative.@exclamation, sky(icl>natural world).@def)
In the example above, "sky(icl>natural world)" and "blue(icl>color)", which represent individual concepts, are UWs; "aoj" (= attribute of an object) is a directed binary semantic relation linking the two UWs; and "@def", "@interrogative", "@past", "@exclamation" and "@entry" are attributes modifying UWs.
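The pieces named in this explanation can be modelled as a small data structure. The following is only a sketch: the class names `UW` and `Relation` are hypothetical, and the dotted serialization produced by `__str__`, the attachment of "@def" to the sky UW and of the remaining attributes to the blue UW, and the argument order are assumptions inferred from the explanation above rather than taken from the UNL Specs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UW:
    """A Universal Word: headword, constraint list and attributes (hypothetical model)."""
    headword: str
    constraints: tuple = ()
    attributes: tuple = ()

    def __str__(self):
        text = self.headword
        if self.constraints:
            text += "(" + ",".join(self.constraints) + ")"
        # Assumed layout: each attribute is appended as a ".@name" suffix.
        text += "".join("." + attribute for attribute in self.attributes)
        return text


@dataclass(frozen=True)
class Relation:
    """A directed, labeled, binary link between two UWs."""
    label: str
    source: UW
    target: UW

    def __str__(self):
        return f"{self.label}({self.source}, {self.target})"


# 'The sky was blue?!' built from the components listed in the text.
sky = UW("sky", ("icl>natural world",), ("@def",))
blue = UW("blue", ("icl>color",),
          ("@entry", "@past", "@interrogative", "@exclamation"))
sentence = Relation("aoj", blue, sky)

print(sentence)
```

Keeping UWs and relations as separate immutable values mirrors the description of a sentence as labeled links between concept nodes; a full sentence would simply be a collection of such `Relation` objects.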
UWs are supposed to represent universal concepts which are expressed in English words or in any other natural language in order to be humanly readable. They consist of a "headword" (the UW root) and a "constraint list" (the UW suffix between parentheses), the latter being used to disambiguate the general concept conveyed by the former. The set of UWs is organized in an ontology-like structure (the so-called "UW System"), where upper concepts are used to disambiguate the lower ones through "icl" (= is a kind of), "iof" (= is an instance of) and "equ" (= is equal to) relations.
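The headword and constraint-list structure described above can be illustrated with a short parser. This is a minimal sketch, not the official UW grammar: the function name `parse_uw` is hypothetical, and it assumes a flat, comma-separated constraint list in which each constraint has the form `relation>upper concept`.

```python
import re


def parse_uw(text):
    """Split 'headword(constraint, ...)' into (headword, [(relation, upper_concept), ...])."""
    match = re.fullmatch(r"([^()]+?)(?:\((.*)\))?", text.strip())
    if match is None:
        raise ValueError(f"not a UW: {text!r}")
    headword, constraint_list = match.group(1), match.group(2)
    constraints = []
    if constraint_list:
        # Assumption: constraints are not nested, so a plain comma split is enough.
        for item in constraint_list.split(","):
            relation, _, upper_concept = item.strip().partition(">")
            constraints.append((relation, upper_concept))
    return headword, constraints


print(parse_uw("sky(icl>natural world)"))
print(parse_uw("blue(icl>color)"))
print(parse_uw("blue"))
```

A bare headword yields an empty constraint list, which corresponds to the general, not yet disambiguated concept.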
Relations are expected to represent semantic links between words in every existing language. They can be ontological (such as "icl" and "iof" referred to above), logical (such as "and" and "or") and thematic (such as "agt" = agent, "ins" = instrument, "tim" = time, "plc" = place, etc.). There are currently 46 relations in the UNL Specs, and they define the syntax of UNL.
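The three families of relations can be captured in a simple lookup table. The sketch below covers only the labels named in this paragraph, not the full set of 46 relations, and the names `RELATION_FAMILIES` and `classify_relation` are hypothetical.

```python
# Only the relation labels mentioned in the text; the UNL Specs define more.
RELATION_FAMILIES = {
    "ontological": {"icl", "iof"},
    "logical": {"and", "or"},
    "thematic": {"agt", "ins", "tim", "plc"},
}


def classify_relation(label):
    """Return the family a relation label belongs to, or None if it is not listed here."""
    for family, labels in RELATION_FAMILIES.items():
        if label in labels:
            return family
    return None


for label in ("icl", "and", "agt", "plc"):
    print(label, "->", classify_relation(label))
```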
Attributes represent information that cannot be conveyed by UWs and relations. Normally, they represent information on tense ("@past", "@future", etc.), reference ("@def", "@indef", etc.), modality ("@can", "@must", etc.), focus ("@topic", "@focus", etc.), and so on.
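As an illustration, the attributes attached to a UW can be separated from it and grouped by the categories listed above. This is a sketch under stated assumptions: the names `ATTRIBUTE_CATEGORIES` and `group_attributes` are hypothetical, attributes are assumed to follow the UW as ".@name" suffixes, and any attribute not named in this paragraph is filed under "other".

```python
# Categories and attribute names taken from the paragraph above; the list is not exhaustive.
ATTRIBUTE_CATEGORIES = {
    "@past": "tense", "@future": "tense",
    "@def": "reference", "@indef": "reference",
    "@can": "modality", "@must": "modality",
    "@topic": "focus", "@focus": "focus",
}


def group_attributes(annotated_uw):
    """Return (uw_without_attributes, {category: [attributes]})."""
    # Assumed layout: 'uw.@attr1.@attr2', so splitting on '.@' isolates each attribute.
    base, *names = annotated_uw.split(".@")
    grouped = {}
    for name in names:
        attribute = "@" + name
        category = ATTRIBUTE_CATEGORIES.get(attribute, "other")
        grouped.setdefault(category, []).append(attribute)
    return base, grouped


print(group_attributes("blue(icl>color).@entry.@past.@interrogative.@exclamation"))
print(group_attributes("sky(icl>natural world).@def"))
```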
Under the UNL Programme, the process of representing natural language sentences as UNL graphs is called enconverting, and the process of generating natural language sentences from UNL graphs is called deconverting. The former, which involves natural language analysis and understanding, is intended to be carried out semi-automatically (i.e., by humans with computer assistance); the latter is expected to be done fully automatically.
History
The UNL Programme started in 1996, as an initiative of the Institute of Advanced Studies of the United Nations University (UNU/IAS) in Tokyo, Japan. In January 2001, the United Nations University set up an autonomous organization, the UNDL Foundation, to be responsible for the development and management of the UNL Programme. The Foundation, a non-profit international organisation, has an independent identity from the United Nations University, although it has special links with the UN. It inherited from the UNU/IAS the mandate of implementing the UNL Programme so that it can fulfil its mission. Its headquarters are in Geneva, Switzerland.
From the very beginning, a consortium of university departments from all regions of the world has been engaged in developing the UNL. This consortium, the UNL Society, is a global-scale network of R&D teams involving about 200 specialists in computer science and linguistics, who are at work creating the linguistic resources and developing the web structure of the UNL System. The UNL Centre provides technological support and co-ordinates the implementation of the Programme.
The Programme has already crossed important milestones. The overall architecture of the UNL System has been developed with a set of basic software and tools necessary for its functioning. These are being tested and improved. A vast amount of linguistic resources from the various native languages already under development, as well as from the UNL expression, has been accumulated in the last few years. Moreover, the technical infrastructure for expanding these resources is already in place, thus facilitating the participation of many more languages in the UNL system from now on. A growing number of scientific papers and academic dissertations on the UNL are being published every year.
The most visible accomplishment so far is the recognition by the Patent Co-operation Treaty (PCT) of the innovative character and industrial applicability of the UNL, which was obtained in May 2002 through the World Intellectual Property Organisation (WIPO). Acquiring the patent for the UNL is a completely novel achievement within the United Nations.
See also
- Information Economy Meta Language
- Semantic network
- Abstract semantic graph
- Semantic translation
- Semantic unification
External links
- UNLWEB The UNL Community Portal
- UNDL Foundation where UNL development is coordinated.
- UNL Specs
- Online book on UNL
- UNL system description