parboiled is an open-source Java library released under an Apache License. It provides support for defining parsing expression grammar (PEG) parsers directly in Java source code.
parboiled is commonly used as an alternative to regular expressions or parser generators (like ANTLR or JavaCC), especially for smaller and medium-size applications.
Apart from providing the constructs for grammar definition, parboiled implements a complete recursive descent parser with support for abstract syntax tree construction, parse error reporting and parse error recovery.
Example
Since parsing with parboiled does not require a separate lexing phase and there is no special syntax to learn for grammar definition, parboiled makes it comparatively easy to build custom parsers quickly.
Consider the following classic "calculator" example, with these rules in a simple pseudo notation:
Expression ← Term (('+' / '-') Term)*
Term ← Factor (('*' / '/') Factor)*
Factor ← Number / '(' Expression ')'
Number ← [0-9]+
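To make the four rules concrete, the following is a minimal hand-written sketch of a recursive descent recognizer that works directly on characters, with no separate lexing phase. It uses only the Java standard library. It is not parboiled code and does not show how parboiled is implemented; the class name PegCalculatorSketch and all of its methods are hypothetical names chosen for this illustration. Each rule becomes one method, the ordered choice "/" is tried left to right, and the input position is restored when an alternative or a repeated group fails.

```java
/** Hypothetical recognizer for the calculator rules; an illustration only, not parboiled. */
final class PegCalculatorSketch {
    private final String input;
    private int pos = 0; // index of the next unread character

    private PegCalculatorSketch(String input) {
        this.input = input;
    }

    /** Returns true if the entire input matches the Expression rule. */
    static boolean matches(String input) {
        PegCalculatorSketch p = new PegCalculatorSketch(input);
        return p.expression() && p.pos == input.length();
    }

    // Expression <- Term (('+' / '-') Term)*
    private boolean expression() {
        if (!term()) return false;
        while (true) {
            int mark = pos; // remember where this repetition started
            if ((ch('+') || ch('-')) && term()) continue;
            pos = mark;     // undo a partially matched repetition
            return true;    // zero or more repetitions always succeeds
        }
    }

    // Term <- Factor (('*' / '/') Factor)*
    private boolean term() {
        if (!factor()) return false;
        while (true) {
            int mark = pos;
            if ((ch('*') || ch('/')) && factor()) continue;
            pos = mark;
            return true;
        }
    }

    // Factor <- Number / '(' Expression ')'
    private boolean factor() {
        int mark = pos;
        if (number()) return true; // first alternative of the ordered choice
        pos = mark;
        if (ch('(') && expression() && ch(')')) return true;
        pos = mark;                // both alternatives failed: restore position
        return false;
    }

    // Number <- [0-9]+
    private boolean number() {
        int start = pos;
        while (pos < input.length()
                && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        return pos > start; // at least one digit is required
    }

    // Matches a single literal character and consumes it on success.
    private boolean ch(char c) {
        if (pos < input.length() && input.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }
}
```

Under these assumptions, PegCalculatorSketch.matches("1+2*(3-4)") accepts the input, while PegCalculatorSketch.matches("1+") rejects it because the trailing operator is left unconsumed. The sketch only recognizes input; it builds no syntax tree and reports no errors.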
With parboiled, this rule description can be translated directly into the following Java code:
import org.parboiled.BaseParser;
public class CalculatorParser extends BaseParser
The class defines the parser rules for the language (as yet without any actions). These rules can then be used to parse actual input.