Spirit Parser Framework
Encyclopedia
The Spirit Parser Framework is an object oriented recursive descent
Recursive descent parser
A recursive descent parser is a top-down parser built from a set of mutually-recursive procedures where each such procedure usually implements one of the production rules of the grammar...

 parser generator framework implemented using template metaprogramming
Metaprogramming
Metaprogramming is the writing of computer programs that write or manipulate other programs as their data, or that do part of the work at compile time that would otherwise be done at runtime...

 techniques. Expression templates
Expression templates
Expression templates is a C++ template metaprogramming technique in which templates are used to represent part of an expression. Typically, the template itself represents a particular type of operation, while the parameters represent the operands to which the operation applies. The expression...

 allow users to approximate the syntax of Extended Backus Naur Form (EBNF) completely in C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

. Parser objects are composed through operator overloading
Operator overloading
In object oriented computer programming, operator overloading—less commonly known as operator ad-hoc polymorphism—is a specific case of polymorphism, where different operators have different implementations depending on their arguments...

 and the result is a backtracking LL(∞)
LL parser
An LL parser is a top-down parser for a subset of the context-free grammars. It parses the input from Left to right, and constructs a Leftmost derivation of the sentence...

 parser that is capable of parsing rather ambiguous grammars.

Spirit can be used for both lexing and parsing, together or separately.

This framework is part of the Boost libraries.

Operators

Because of limitations of the C++ language, the syntax of Spirit has been designed around the operator precedences of C++, while bearing resemblance to both EBNF and regular expressions.
syntax explanation
x >> y Match x followed by y.
*x Match x repeated zero or more times. (This is representing the Kleene star
Kleene star
In mathematical logic and computer science, the Kleene star is a unary operation, either on sets of strings or on sets of symbols or characters. The application of the Kleene star to a set V is written as V*...

; C++ lacks a unary postfix operator
Operator (programming)
Programming languages typically support a set of operators: operations which differ from the language's functions in calling syntax and/or argument passing mode. Common examples that differ by syntax are mathematical arithmetic operations, e.g...

*)
x | y Match x. If x does not match, try to match y.
+x Match x repeated one or more times.
-x Match x zero or one time.
x & y Match x and y.
x - y Match x but not y.
x ^ y Match x or y or both in any order.
x || y Match x or y or x followed by y.
x [ function_expression ] Execute the function/functor returned by function_expression, if x matched.
( x ) Match x (can be used for priority grouping)
x % y Match one or more repetitions of x, separated by occurrences of y.
~x Match anything but x (only with character classes such as ch_p or alnum_p)

Example

  1. include
  2. include
  3. include
  4. include


using namespace std;
using namespace BOOST_SPIRIT_CLASSIC_NS;

int main
{
string input;

cout << "Input a line." << endl;
getline(cin, input);

cout << "Got '" << input << "'." << endl;

unsigned count = 0;

/*
Next line parses the input (input.c_str),
using a parser constructed with the following semantics
(identation matches source for clarity):

Zero or more occurrences of (
literal string "cat" ( when matched, increment the counter "count" )
or any character (to move on finding the next occurrence of "cat")
)
*/
parse(input.c_str,
*( str_p("cat") [ increment_a(count) ]
| anychar_p
));
/*
The parser is constructed by the compiler using operator
overloading and template matching, so the actual work is
done within spirit::parse, and the expression starting
with * only initializes the rule object that the parse
function uses.
*/

// last, show results.
cout << "The input had " << count << " occurrences of 'cat'" << endl;
return 0;
}


Of course, there are better algorithms suited for string searching,
but this example gives an idea how to construct rules and attach
actions to them.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK