Context-sensitive grammar
Encyclopedia
A context-sensitive grammar (CSG) is a formal grammar
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...

 in which the left-hand sides and right-hand sides of any production rules may be surrounded by a context of terminal and nonterminal symbols. Context-sensitive grammars are more general than context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....

s but still orderly enough to be parsed
Parsing
In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar...

 by a linear bounded automaton
Linear bounded automaton
In computer science, a linear bounded automaton is a restricted form of nondeterministic Turing machine.-Operation:Linear bounded automata satisfy the following three conditions:...

.

The concept of context-sensitive grammar was introduced by Noam Chomsky
Noam Chomsky
Avram Noam Chomsky is an American linguist, philosopher, cognitive scientist, and activist. He is an Institute Professor and Professor in the Department of Linguistics & Philosophy at MIT, where he has worked for over 50 years. Chomsky has been described as the "father of modern linguistics" and...

 in the 1950s as a way to describe the syntax of natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

 where it is indeed often the case that a word may or may not be appropriate in a certain place depending upon the context. A formal language
Formal language
A formal language is a set of words—that is, finite strings of letters, symbols, or tokens that are defined in the language. The set from which these letters are taken is the alphabet over which the language is defined. A formal language is often defined by means of a formal grammar...

 that can be described by a context-sensitive grammar is called a context-sensitive language
Context-sensitive language
In theoretical computer science, a context-sensitive language is a formal language that can be defined by a context-sensitive grammar. That is one of the four types of grammars in the Chomsky hierarchy...

.

Formal definition

A formal grammar
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...

 G = (N, Σ, P, S) ( this is the same as G = (V, T, P, S), just that the Non-Terminal V(ariable) is replaced by N and T(erminal) is replaced by Σ ) is context-sensitive if all rules in P are of the form
αAβ → αγβ

where AN (i.e., A is a single nonterminal), α,β ∈ (N U Σ)* (i.e., α and β are strings of nonterminals and terminals) and γ ∈ (N U Σ)+ (i.e., γ is a nonempty string of nonterminals and terminals).

Some definitions also add that for any production rule of the form u → v of a context-sensitive grammar, it shall be true that |u|≤|v|. Here |u| and |v| denote the length of the strings respectively.

In addition, a rule of the form
S → λ provided S does not appear on the right side of any rule

where λ represents the empty string
Empty string
In computer science and formal language theory, the empty string is the unique string of length zero. It is denoted with λ or sometimes Λ or ε....

 is permitted. The addition of the empty string allows the statement that the context sensitive languages are a proper superset of the context free languages, rather than having to make the weaker statement that all context free grammars with no →λ productions are also context sensitive grammars.

The name context-sensitive is explained by the α and β that form the context of A and determine whether A can be replaced with γ or not. This is different from a context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....

 where the context of a nonterminal is not taken into consideration. (Indeed, every production of a context free grammar is of the form V → w where V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (w can be empty)).

If the possibility of adding the empty string to a language is added to the strings recognized by the noncontracting grammars (which can never include the empty string) then the languages in these two definitions are identical.

Examples

  • This grammar generates the canonical non-context-free language
    Context-free language
    In formal language theory, a context-free language is a language generated by some context-free grammar. The set of all context-free languages is identical to the set of languages accepted by pushdown automata.-Examples:...

     :


The generation chain for aaa bbb ccc is:


More complicated grammars can be used to parse , and other languages with even more letters.
  • The following grammar generates the non-context-free copy language, :


The generation chain for abab is:

Normal forms

Every context-sensitive grammar which does not generate the empty string can be transformed into an equivalent one in Kuroda normal form
Kuroda normal form
In formal language theory, a grammar is in Kuroda normal form if, and only if, all production rules are of the form:Every grammar in Kuroda normal form is monotonic, and therefore, generates a context-sensitive language. Conversely, every context-sensitive language which does not generate the...

. "Equivalent" here means that the two grammars generate the same language. The normal form will not in general be context-sensitive, but will be a noncontracting grammar
Noncontracting grammar
- Formal definition :In formal language theory, a grammar is noncontracting if all of its production rules are of the formThat is, none of the rules decreases the size of the string that is being rewritten....

.

Computational properties and uses

The decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...

 that asks whether a certain string s belongs to the language of a certain context-sensitive grammar G, is PSPACE-complete
PSPACE-complete
In complexity theory, a decision problem is PSPACE-complete if it is in the complexity class PSPACE, and every problem in PSPACE can be reduced to it in polynomial time...

. There are even some context-sensitive grammars whose fixed grammar recognition problem is PSPACE-complete.

The emptiness problem for context-sensitive grammars (given a context-sensitive grammar G, is ?) is undecidable.

It has been shown that nearly all natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

s may in general be characterized by context-sensitive grammars, but the whole class of CSG's seems to be much bigger than natural languages. Worse yet, since the aforementioned decision problem for CSG's is PSPACE-complete, that makes them totally unworkable for practical use, as a polynomial-time algorithm for a PSPACE-complete problem would imply P=NP. Ongoing research on computational linguistics
Computational linguistics
Computational linguistics is an interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective....

 has focused on formulating other classes of languages that are "mildly context-sensitive
Mildly context-sensitive language
In formal grammar theory, mildly context-sensitive languages are a class of formal languages which can be efficiently parsed, but still possess enough context sensitivity to allow the parsing of natural languages. The concept was first introduced by Aravind Joshi in 1985.-Definition:Mild...

" whose decision problems are feasible, such as tree-adjoining grammar
Tree-adjoining grammar
Tree-adjoining grammar is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol...

s, combinatory categorial grammar
Combinatory categorial grammar
Combinatory categorial grammar is an efficiently parseable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate-argument structure, quantification and information structure.CCG relies on...

s, coupled context-free languages, and linear context-free rewriting systems. The languages generated by these formalisms properly lie between the context-free and context-sensitive languages.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK