Regular grammar
Encyclopedia
In theoretical computer science
, a regular grammar is a formal grammar
that describes a regular language
.
(N, Σ, P, S) such that all the production rules in P are of one of the following forms:
In a left regular grammar (also called left linear grammar), all rules obey the forms
An example of a right regular grammar G with N = {S, A}, Σ = {a, b, c}, P consists of the following rules
and S is the start symbol. This grammar describes the same language as the regular expression
a*bc*.
A regular grammar is a left or right regular grammar.
Some textbooks and articles disallow empty production rules,
and assume that the empty string is not present in languages.
Some authors call this type of grammar a right regular grammar (or right linear grammar) and the type above a strictly right regular grammar (or strictly right linear grammar).
An extended left regular grammar is one in which all rules obey one of
Some authors call this type of grammar a left regular grammar and the type above a strictly left regular grammar.
, such that the grammar generates exactly the language the automaton accepts. Hence, the left regular grammars generate exactly all regular language
s. The right regular grammars describe the reverses of all such languages, that is, exactly the regular languages as well.
Every strict right regular grammar is extended right regular, while every extended right regular grammar can be made strict by inserting new nonterminals, such that the result generates the same language; hence, extended right regular grammars generate the regular languages as well. Analogously, so do the extended left regular grammars.
If empty productions are disallowed, only all regular languages that do not include the empty string can be generated.
, but not necessarily a regular one.
What is more, such a grammar need not generate a regular language: all linear grammars can be easily brought into this form, and hence, such grammars can generate exactly all linear language
s, including nonregular ones.
For instance, the grammar G with N = {S, A}, Σ = {a, b}, P with start symbol S and rules
generates , the paradigmatic non-regular linear language.
Theoretical computer science
Theoretical computer science is a division or subset of general computer science and mathematics which focuses on more abstract or mathematical aspects of computing....
, a regular grammar is a formal grammar
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...
that describes a regular language
Regular language
In theoretical computer science and formal language theory, a regular language is a formal language that can be expressed using regular expression....
.
Strictly regular grammars
A right regular grammar (also called right linear grammar) is a formal grammarFormal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...
(N, Σ, P, S) such that all the production rules in P are of one of the following forms:
- B → a - where B is a non-terminalTerminal and nonterminal symbolsIn computer science, terminal and nonterminal symbols are the lexical elements used in specifying the production rules that constitute a formal grammar...
in N and a is a terminal in Σ - B → aC - where B and C are in N and a is in Σ
- B → ε - where B is in N and ε denotes the empty stringEmpty stringIn computer science and formal language theory, the empty string is the unique string of length zero. It is denoted with λ or sometimes Λ or ε....
, i.e. the string of length 0.
In a left regular grammar (also called left linear grammar), all rules obey the forms
- A → a - where A is a non-terminal in N and a is a terminal in Σ
- A → Ba - where A and B are in N and a is in Σ
- A → ε - where A is in N and ε is the empty string.
An example of a right regular grammar G with N = {S, A}, Σ = {a, b, c}, P consists of the following rules
- S → aS
- S → bA
- A → ε
- A → cA
and S is the start symbol. This grammar describes the same language as the regular expression
Regular expression
In computing, a regular expression provides a concise and flexible means for "matching" strings of text, such as particular characters, words, or patterns of characters. Abbreviations for "regular expression" include "regex" and "regexp"...
a*bc*.
A regular grammar is a left or right regular grammar.
Some textbooks and articles disallow empty production rules,
and assume that the empty string is not present in languages.
Extended regular grammars
An extended right regular grammar is one in which all rules obey one of- B → a - where B is a non-terminal in N and a is a terminal in Σ
- A → wB - where A and B are in N and w is in Σ*
- A → ε - where A is in N and ε is the empty string.
Some authors call this type of grammar a right regular grammar (or right linear grammar) and the type above a strictly right regular grammar (or strictly right linear grammar).
An extended left regular grammar is one in which all rules obey one of
- A → a - where A is a non-terminal in N and a is a terminal in Σ
- A → Bw - where A and B are in N and w is in Σ*
- A → ε - where A is in N and ε is the empty string.
Some authors call this type of grammar a left regular grammar and the type above a strictly left regular grammar.
Expressive power
There is a direct one-to-one correspondence between the rules of a (strictly) left regular grammar and those of a nondeterministic finite state automatonNondeterministic finite state machine
In the automata theory, a nondeterministic finite state machine or nondeterministic finite automaton is a finite state machine where from each state and a given input symbol the automaton may jump into several possible next states...
, such that the grammar generates exactly the language the automaton accepts. Hence, the left regular grammars generate exactly all regular language
Regular language
In theoretical computer science and formal language theory, a regular language is a formal language that can be expressed using regular expression....
s. The right regular grammars describe the reverses of all such languages, that is, exactly the regular languages as well.
Every strict right regular grammar is extended right regular, while every extended right regular grammar can be made strict by inserting new nonterminals, such that the result generates the same language; hence, extended right regular grammars generate the regular languages as well. Analogously, so do the extended left regular grammars.
If empty productions are disallowed, only all regular languages that do not include the empty string can be generated.
Mixing left and right regular rules
If mixing of left-regular and right-regular rules is allowed, we still have a linear grammarLinear grammar
In computer science, a grammar is linear if it is context-free and all of its productions' right hand sides have at most one nonterminal.A linear language is a language generated by some linear grammar.-Example:...
, but not necessarily a regular one.
What is more, such a grammar need not generate a regular language: all linear grammars can be easily brought into this form, and hence, such grammars can generate exactly all linear language
Linear grammar
In computer science, a grammar is linear if it is context-free and all of its productions' right hand sides have at most one nonterminal.A linear language is a language generated by some linear grammar.-Example:...
s, including nonregular ones.
For instance, the grammar G with N = {S, A}, Σ = {a, b}, P with start symbol S and rules
- S → aA
- A → Sb
- S → ε
generates , the paradigmatic non-regular linear language.
See also
- Regular expressionRegular expressionIn computing, a regular expression provides a concise and flexible means for "matching" strings of text, such as particular characters, words, or patterns of characters. Abbreviations for "regular expression" include "regex" and "regexp"...
, a compact notation for regular grammars - Prefix grammarPrefix grammarIn computer science, a prefix grammar is a type of string rewriting system, consisting of a set of string rewriting rules, and similar to a formal grammar or a semi-Thue system. What is specific about prefix grammars is not the shape of their rules, but the way in which they are applied: only...
- Chomsky hierarchyChomsky hierarchyWithin the field of computer science, specifically in the area of formal languages, the Chomsky hierarchy is a containment hierarchy of classes of formal grammars....