GLR parser
Encyclopedia
A GLR parser is an extension of an LR parser
LR parser
In computer science, an LR parser is a parser that reads input from Left to right and produces a Rightmost derivation. The term LR parser is also used; where the k refers to the number of unconsumed "look ahead" input symbols that are used in making parsing decisions...

 algorithm to handle nondeterministic and ambiguous grammars. First described in a 1984 paper by Masaru Tomita
Masaru Tomita
is a Japanese molecular biologist and computer scientist, best known as the director of the and/or the inventor of GLR parser algorithm. He is a professor of Keio University, president of the , and the founder and board member of . He is also the co-founder and on the board of directors of .From...

, it has also been referred to as a "parallel parser". Tomita presented five stages in his original work, though, in practice, it is the second stage that is recognized as the GLR parser.

Though the algorithm has evolved since its original form, the principles have remained intact: Tomita's goal was to parse natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

 text thoroughly and efficiently. Standard LR parsers cannot accommodate the nondeterministic and ambiguous nature of natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

, and the GLR algorithm can.

Algorithm

Briefly, the GLR algorithm works in a manner similar to the LR parser
LR parser
In computer science, an LR parser is a parser that reads input from Left to right and produces a Rightmost derivation. The term LR parser is also used; where the k refers to the number of unconsumed "look ahead" input symbols that are used in making parsing decisions...

 algorithm, except that, given a particular grammar, a GLR parser will process all possible interpretations of a given input in a breadth-first search
Breadth-first search
In graph theory, breadth-first search is a graph search algorithm that begins at the root node and explores all the neighboring nodes...

. On the front-end, a GLR parser generator converts an input grammar into parser tables, in a manner similar to an LR generator. However, where LR parse tables allow for only one state transition (given a state and an input token), GLR parse tables allow for multiple transitions. In effect, GLR allows for shift/reduce and reduce/reduce conflicts.

When a conflicting transition is encountered, the parse stack is forked into two or more parallel parse stacks, where the state corresponding to each possible transition is at the top. Then, the next input token is read and used to determine the next transition(s) for each of the "top" states – and further forking can occur. If any given top state and input token do not result in at least one transition, then that "path" through the parse tables is invalid and can be discarded.

A crucial optimization allows sharing of common prefixes and suffixes of these stacks, which constrains the overall search space
Search space
Search space may refer to one of the following.*In optimization, the domain of the function to be optimized*In search algorithms of computer science, the set of all possible solutions...

 and memory usage required to parse input text. The complex structures that arise from this improvement make the search graph a directed acyclic graph
Directed acyclic graph
In mathematics and computer science, a directed acyclic graph , is a directed graph with no directed cycles. That is, it is formed by a collection of vertices and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of...

 (with additional restrictions on the "depths" of various nodes), rather than a tree.

Advantages

Recognition using the GLR algorithm has the same worst-case time complexity as the CYK algorithm
CYK algorithm
The Cocke–Younger–Kasami algorithm is a parsing algorithm for context-free grammars. It employs bottom-up parsing and dynamic programming....

 and Earley algorithm – O(n3). However, GLR carries two additional advantages:
  • The time required to run the algorithm is proportional to the degree of nondeterminism in the grammar – on deterministic grammars the GLR algorithm runs in O(n) time (this is not true of the Earley and CYK algorithms, but the original Earley algorithms can be modified to ensure it)
  • The GLR algorithm is "on-line" – that is, it consumes the input tokens in a specific order and performs as much work as possible after consuming each token.


In practice, the grammars of most programming languages are deterministic or "nearly deterministic," meaning that any nondeterminism is usually resolved within a small (though possibly unbounded) number of tokens. Compared to other algorithms capable of handling the full class of context-free grammars (such as Earley or CYK), the GLR algorithm gives better performance on these "nearly deterministic" grammars, because only a single stack will be active during the majority of the parsing process.

GLR can be combined with the LALR(1) algorithm, in a hybrid parser, allowing still higher performance .

See also

  • Comparison of parser generators
    Comparison of parser generators
    This is a list of notable lexer generators and parser generators for various language classes.- Regular languages :- Deterministic context-free languages :-Parsing expression grammars, deterministic boolean grammars:...

  • ASF+SDF Meta Environment
    ASF+SDF Meta Environment
    The ASF+SDF Meta-Environment is an IDE and toolset for interactive program analysis and transformation. It combines SDF , ASF and other technologies.Some of the features:...

  • DMS Software Reengineering Toolkit
    DMS Software Reengineering Toolkit
    The DMS Software Reengineering Toolkit is a proprietary set of program transformation tools available for automating custom source program analysis, modification, translation or generation of software systems for arbitrary mixtures of source languages for large scale software systems.DMS has been...

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK