Controlled natural language
Controlled natural languages (CNLs) are subsets of natural languages, obtained by restricting the grammar and vocabulary in order to reduce or eliminate ambiguity and complexity. Traditionally, controlled languages fall into two major types: those that improve readability for human readers (e.g. non-native speakers), and those that enable reliable automatic semantic analysis of the language.
Languages of the first type (often called "simplified" or "technical" languages), such as ASD Simplified Technical English, Caterpillar Technical English, and IBM's Easy English, are used in industry to improve the quality of technical documentation and, in some cases, to simplify its (semi-)automatic translation. These languages restrict the writer through general rules such as "write short and grammatically simple sentences", "use nouns instead of pronouns", "use determiners", and "use active instead of passive".
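As a rough illustration of how such writing rules might be checked mechanically, the following minimal sketch flags three of the rules quoted above. The word limit, the pronoun list, and the passive-voice pattern are assumptions made for this example and are not taken from any of the languages named; real checkers rely on much richer linguistic analysis.

```python
import re

MAX_WORDS = 20  # assumed limit; each real style guide sets its own
PRONOUNS = {"it", "they", "them", "he", "she"}  # deliberately tiny, illustrative list
# Crude passive-voice heuristic: a form of "to be" followed by a word ending in -ed/-en.
PASSIVE = re.compile(r"\b(is|are|was|were|be|been|being)\s+\w+(ed|en)\b", re.IGNORECASE)


def check_sentence(sentence: str) -> list[str]:
    """Return the names of the (hypothetical) rules that the sentence violates."""
    words = re.findall(r"[A-Za-z']+", sentence)
    problems = []
    if len(words) > MAX_WORDS:
        problems.append("too long")
    if any(w.lower() in PRONOUNS for w in words):
        problems.append("pronoun")
    if PASSIVE.search(sentence):
        problems.append("passive")
    return problems


if __name__ == "__main__":
    for s in ["The operator opens the valve.",
              "The valve is opened by the operator.",
              "It was removed."]:
        print(s, "->", check_sentence(s))
```

Because the checks are heuristic, they can both miss violations and raise false alarms (for example, irregular participles such as "built" are not detected).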
Languages of the second type have a formal logical basis: they have a formal syntax and semantics and can be mapped to an existing formal language, such as first-order logic. They can therefore be used as knowledge-representation languages, and writing in them is supported by fully automatic consistency and redundancy checks, query answering, etc.
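To make the idea of mapping a controlled language to first-order logic concrete, here is a minimal sketch of a toy language with only two sentence patterns. The patterns, the function name `to_fol`, and the textual output notation are all hypothetical and do not reproduce the grammar of any controlled language listed in this article.

```python
import re

# Two assumed sentence patterns; every other sentence is rejected.
UNIVERSAL = re.compile(r"^Every (\w+) is an? (\w+)\.$")
FACT = re.compile(r"^([A-Z]\w*) is an? (\w+)\.$")


def to_fol(sentence: str) -> str:
    """Translate a sentence of the toy language into a first-order formula string."""
    m = UNIVERSAL.match(sentence)
    if m:
        noun, cls = m.groups()
        return f"forall x. ({noun}(x) -> {cls}(x))"
    m = FACT.match(sentence)
    if m:
        name, cls = m.groups()
        return f"{cls}({name.lower()})"
    # Rejecting unknown input is what keeps the language unambiguous.
    raise ValueError(f"not in the controlled language: {sentence!r}")


if __name__ == "__main__":
    print(to_fol("Every man is a human."))
    print(to_fol("Socrates is a man."))
```

Since every accepted sentence has exactly one translation, formulas produced this way could in principle be passed to a theorem prover or reasoner for the consistency checks and query answering mentioned above.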
Languages
Existing logic-based controlled natural languages include:
- Attempto Controlled English
- Common Logic Controlled English (CLCE)
- Pseudo Natural Language (PNL)
- Rabbit
- PENG (Processable ENGlish)
- Restricted Natural Language Statements (RNLS)
- Semantics of Business Vocabulary and Business Rules
- ClearTalk
Other existing controlled natural languages include:
- ASD Simplified Technical English
- Basic English
- E-Prime
- Gellish
- Newspeak
- Controlled Language Optimized for Uniform Translation (CLOUT)
- Special English
- Simplified Technical Russian
- EasyEnglish
See also
- Constructed language
- Knowledge representation and reasoning
- Natural language processing
- Controlled vocabulary
- Postediting
- Controlled language in machine translation