Parboiled (Java)
parboiled is an open-source Java library released under an Apache License. It provides support for defining PEG parsers directly in Java source code. parboiled is commonly used as an alternative to regular expressions or parser generators (like ANTLR or JavaCC), especially for small and medium-size applications.
Apart from providing the constructs for grammar definition, parboiled implements a complete recursive descent parser with support for abstract syntax tree construction, parse error reporting and parse error recovery.
Example
Since parsing with parboiled does not require a separate lexing phase and there is no special syntax to learn for grammar definition, parboiled makes it comparatively easy to build custom parsers quickly.
Consider the following classic "calculator" example, with these rules in a simple pseudo notation:
- Expression ← Term (('+' / '-') Term)*
- Term ← Factor (('*' / '/') Factor)*
- Factor ← Number / '(' Expression ')'
- Number ← [0-9]+
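Read as a recursive descent parser, each of the four rules corresponds to one procedure that calls the procedures of the rules it refers to. The following is a minimal hand-written sketch of that idea in plain Java. It does not use parboiled or its API; the class name `CalculatorRecognizer` and all of its methods are made up for illustration, and it only recognizes input (it builds no parse tree and reports no errors).

```java
// Illustrative sketch only: a hand-written recognizer for the four rules,
// with one method per rule. This is NOT parboiled code.
final class CalculatorRecognizer {
    private final String input;
    private int pos = 0; // index of the next unread character

    private CalculatorRecognizer(String input) {
        this.input = input;
    }

    /** Returns true if the entire input matches the Expression rule. */
    static boolean matches(String input) {
        CalculatorRecognizer r = new CalculatorRecognizer(input);
        return r.expression() && r.pos == input.length();
    }

    // Expression ← Term (('+' / '-') Term)*
    private boolean expression() {
        if (!term()) return false;
        while (true) {
            int saved = pos;
            if ((ch('+') || ch('-')) && term()) continue;
            pos = saved; // backtrack; the repetition ends here
            return true;
        }
    }

    // Term ← Factor (('*' / '/') Factor)*
    private boolean term() {
        if (!factor()) return false;
        while (true) {
            int saved = pos;
            if ((ch('*') || ch('/')) && factor()) continue;
            pos = saved; // backtrack; the repetition ends here
            return true;
        }
    }

    // Factor ← Number / '(' Expression ')'
    private boolean factor() {
        int saved = pos;
        if (number()) return true; // ordered choice: try Number first
        pos = saved;
        if (ch('(') && expression() && ch(')')) return true;
        pos = saved;
        return false;
    }

    // Number ← [0-9]+
    private boolean number() {
        int start = pos;
        while (pos < input.length()
                && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        return pos > start; // at least one digit is required
    }

    // Consumes the character c if it is next in the input.
    private boolean ch(char c) {
        if (pos < input.length() && input.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("1+2"));     // true
        System.out.println(matches("2*(3+4)")); // true
        System.out.println(matches("2*(3+"));   // false
    }
}
```

Because the rules contain no whitespace handling, an input such as `1 + 2` is rejected by this sketch; characters are matched directly, with no separate lexing step.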
With parboiled, this rule description can be translated directly into Java code: a parser class that defines the rules for the language (as yet without any actions), which can then be used to parse actual input.
See also
- Parsing expression grammars
- Regular expressions
- ANTLR
- JavaCC