Mildly context-sensitive language - AbsoluteAstronomy.com

Formal grammar

A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...

theory, mildly context-sensitive languages are a class of formal languages which can be efficiently parsed, but still possess enough context sensitivity to allow the parsing

Parsing

In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar...

of natural language

Natural language

In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

s. The concept was first introduced by Aravind Joshi

Aravind Joshi

Aravind Krishna Joshi is the Henry Salvatori Professor of Computer and Cognitive Science in the computer science department of the University of Pennsylvania...

in 1985.

Definition

Mild context-sensitivity is defined in terms of sets of languages. A set of languages is mildly context-sensitive if and only if

it contains all context-free languages
Context-free language
In formal language theory, a context-free language is a language generated by some context-free grammar. The set of all context-free languages is identical to the set of languages accepted by pushdown automata.-Examples:...

.
it admits limited cross-serial dependencies.
all the languages are parsable in polynomial time.
all the languages have constant growth; this means that the distribution of string lengths should be linear rather than supralinear. This is often guaranteed by proving a pumping lemma
Pumping lemma
In the theory of formal languages in computability theory, a pumping lemma or pumping argument states that, for a particular language to be a member of a language class, any sufficiently long string in the language contains a section, or sections, that can be removed, or repeated any number of...

for some class of mildly context-sensitive languages.

Formalisms

Some attempts at creating mildly context-sensitive language formalisms include:

Linear context-free rewriting systems developed by D. J. Weir
Minimalist grammar
Minimalist grammar
Minimalist grammars are a class of formal grammars that aim to provide a more rigorous, usually proof-theoretic, formalization of Chomskyan Minimalist program than is normally provided in the mainstream Minimalist literature...

s of Edward P. Stabler, Alain Lecomte, Christian Retoré, etc.
Head grammar
Head grammar
Head Grammar is a grammar formalism introduced in Carl Pollard as an extension of the Context-free grammar class of grammars. The class of head grammars is a subset of the linear context-free rewriting systems....

s of Carl Pollard
Combinatory categorial grammar
Combinatory categorial grammar
Combinatory categorial grammar is an efficiently parseable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate-argument structure, quantification and information structure.CCG relies on...

s developed by Mark Steedman
Linear indexed grammars defined by Gerald Gazdar
Tree-adjoining grammar
Tree-adjoining grammar
Tree-adjoining grammar is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol...

s developed by Aravind Joshi
Aravind Joshi
Aravind Krishna Joshi is the Henry Salvatori Professor of Computer and Cognitive Science in the computer science department of the University of Pennsylvania...

The first two of these grammar classes define the same set of languages, while the last four define another single, strictly smaller class. The larger of the two classes may be parsed by thread automatons, while the other smaller one may be parsed by embedded pushdown automaton

Embedded pushdown automaton

An embedded pushdown automaton or EPDA is a computational model for parsing languages generated by tree-adjoining grammars . It is similar to the context-free grammar-parsing pushdown automaton, except that instead of using a plain stack to store symbols, it has a stack of iterated stacks that...

s.

While all languages in both classes are mildly context-sensitive and both classes support some cross-serial dependency, neither class exhausts the full set of mildly context-sensitive languages.

Control Language Hierarchy

A more precisely defined hierarchy of languages that correspond to the mildly context-sensitive class was defined by David J. Weir. Based on the work of Nabil A. Khabbaz, Weir's Control Language Hierarchy is a containment hierarchy of countable set of language classes where the Level-1 is defined as context-free, and Level-2 is the class of tree-adjoining and the other three grammars.

Following are some of the properties of Level-k languages in the hierarchy:

Level-k languages are properly contained by Level-(k + 1) language class
Level-k languages can be parsed in time
Level-k contains the language , but not
Level-k contains the language , but not

Those properties correspond well (at least for small k > 1) to the conditions of mildly context-sensitive languages imposed by Joshi, and as k gets bigger, the language class become, in a sense, less mildly context-sensitive.

External links

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.