Mildly context-sensitive language
Encyclopedia
In formal grammar
theory, mildly context-sensitive languages are a class of formal languages which can be efficiently parsed, but still possess enough context sensitivity to allow the parsing
of natural language
s. The concept was first introduced by Aravind Joshi
in 1985.
The first two of these grammar classes define the same set of languages, while the last four define another single, strictly smaller class. The larger of the two classes may be parsed by thread automatons, while the other smaller one may be parsed by embedded pushdown automaton
s.
While all languages in both classes are mildly context-sensitive and both classes support some cross-serial dependency, neither class exhausts the full set of mildly context-sensitive languages.
Following are some of the properties of Level-k languages in the hierarchy:
Those properties correspond well (at least for small k > 1) to the conditions of mildly context-sensitive languages imposed by Joshi, and as k gets bigger, the language class become, in a sense, less mildly context-sensitive.
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...
theory, mildly context-sensitive languages are a class of formal languages which can be efficiently parsed, but still possess enough context sensitivity to allow the parsing
Parsing
In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar...
of natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...
s. The concept was first introduced by Aravind Joshi
Aravind Joshi
Aravind Krishna Joshi is the Henry Salvatori Professor of Computer and Cognitive Science in the computer science department of the University of Pennsylvania...
in 1985.
Definition
Mild context-sensitivity is defined in terms of sets of languages. A set of languages is mildly context-sensitive if and only if- it contains all context-free languagesContext-free languageIn formal language theory, a context-free language is a language generated by some context-free grammar. The set of all context-free languages is identical to the set of languages accepted by pushdown automata.-Examples:...
. - it admits limited cross-serial dependencies.
- all the languages are parsable in polynomial time.
- all the languages have constant growth; this means that the distribution of string lengths should be linear rather than supralinear. This is often guaranteed by proving a pumping lemmaPumping lemmaIn the theory of formal languages in computability theory, a pumping lemma or pumping argument states that, for a particular language to be a member of a language class, any sufficiently long string in the language contains a section, or sections, that can be removed, or repeated any number of...
for some class of mildly context-sensitive languages.
Formalisms
Some attempts at creating mildly context-sensitive language formalisms include:- Linear context-free rewriting systems developed by D. J. Weir
- Minimalist grammarMinimalist grammarMinimalist grammars are a class of formal grammars that aim to provide a more rigorous, usually proof-theoretic, formalization of Chomskyan Minimalist program than is normally provided in the mainstream Minimalist literature...
s of Edward P. Stabler, Alain Lecomte, Christian Retoré, etc. - Head grammarHead grammarHead Grammar is a grammar formalism introduced in Carl Pollard as an extension of the Context-free grammar class of grammars. The class of head grammars is a subset of the linear context-free rewriting systems....
s of Carl Pollard - Combinatory categorial grammarCombinatory categorial grammarCombinatory categorial grammar is an efficiently parseable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate-argument structure, quantification and information structure.CCG relies on...
s developed by Mark Steedman - Linear indexed grammars defined by Gerald Gazdar
- Tree-adjoining grammarTree-adjoining grammarTree-adjoining grammar is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol...
s developed by Aravind JoshiAravind JoshiAravind Krishna Joshi is the Henry Salvatori Professor of Computer and Cognitive Science in the computer science department of the University of Pennsylvania...
The first two of these grammar classes define the same set of languages, while the last four define another single, strictly smaller class. The larger of the two classes may be parsed by thread automatons, while the other smaller one may be parsed by embedded pushdown automaton
Embedded pushdown automaton
An embedded pushdown automaton or EPDA is a computational model for parsing languages generated by tree-adjoining grammars . It is similar to the context-free grammar-parsing pushdown automaton, except that instead of using a plain stack to store symbols, it has a stack of iterated stacks that...
s.
While all languages in both classes are mildly context-sensitive and both classes support some cross-serial dependency, neither class exhausts the full set of mildly context-sensitive languages.
Control Language Hierarchy
A more precisely defined hierarchy of languages that correspond to the mildly context-sensitive class was defined by David J. Weir. Based on the work of Nabil A. Khabbaz, Weir's Control Language Hierarchy is a containment hierarchy of countable set of language classes where the Level-1 is defined as context-free, and Level-2 is the class of tree-adjoining and the other three grammars.Following are some of the properties of Level-k languages in the hierarchy:
- Level-k languages are properly contained by Level-(k + 1) language class
- Level-k languages can be parsed in time
- Level-k contains the language , but not
- Level-k contains the language , but not
Those properties correspond well (at least for small k > 1) to the conditions of mildly context-sensitive languages imposed by Joshi, and as k gets bigger, the language class become, in a sense, less mildly context-sensitive.