Multi-pass compiler
Encyclopedia
A multi-pass compiler is a type of compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

 that processes the source code
Source code
In computer science, source code is text written using the format and syntax of the programming language that it is being written in. Such a language is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source...

 or abstract syntax tree
Abstract syntax tree
In computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. The syntax is 'abstract' in the sense that it...

 of a program several times. This is in contrast to a one-pass compiler
One-pass compiler
In computer programming, a one-pass compiler is a compiler that passes through the source code of each compilation unit only once. In other words, a one-pass compiler does not "look back" at code it previously processed. Another term sometimes used is narrow compiler, which emphasizes the limited...

, which traverses the program only once. Each pass takes the result of the previous pass as the input, and creates an intermediate output. In this way, the (intermediate) code is improved pass by pass, until the final pass emits the final code.

Multi-pass compilers are sometimes called wide compilers, referring to the greater scope of the passes: they can "see" the entire program being compiled, instead of just a small portion of it. The wider scope thus available to these compilers allows better code generation (e.g. smaller code size, faster code) compared to the output of one-pass compilers, at the cost of higher compiler time and memory consumption. In addition, some languages
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

 cannot be compiled in a single pass, as a result of their design.

Lexical analysis

This stage of a multi-pass compiler is to remove irrelevant information from the source program that syntax analysis will not be able to use or interpret. Irrelevant information could include things like comments and white space. In addition to removing the irrelevant information, the lexical analysis determines the lexical tokens of the language.

Syntax analysis

Syntax analysis is responsible for looking at syntax rules of the language (often as a context-free grammar
Context-free grammar
In formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....

), and building some intermediate representation of the language. An example of this intermediate representation could be something like a Abstract Syntax Tree
Abstract syntax tree
In computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. The syntax is 'abstract' in the sense that it...

 or a Directed Acyclic Graph
Directed acyclic graph
In mathematics and computer science, a directed acyclic graph , is a directed graph with no directed cycles. That is, it is formed by a collection of vertices and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of...

.

Semantic analysis

Semantic analysis takes the representation made from syntax analysis and applies semantic rules to the representation to make sure that the program meets the semantic rules requirements of the language. For example, in the example below at the stage of semantic analysis if the language required that conditions on if statements were boolean expressions the cond would be type-checked to make sure it would be a valid boolean expression.


if(cond) {
...
} else {
...
}


In addition to performing semantic analysis at this stage of compilation, often symbol tables are created in order to assist in code generation.

Code generation

This final stage of a typical compiler converts the intermediate representation of program into an executable set of instructions (often assembly
Assembly language
An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture...

). This last stage in the only stage in compilation that is machine dependent. There can also be optimization
Optimization (computer science)
In computer science, program optimization or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources...

 done at this stage of compilation that make the program more efficient.

Advantages of Multi-pass compilers

Machine Independent: Since the multiple passes include a modular structure, and the code generation decoupled from the other steps of the compiler, the passes can be reused for different hardware/machines.

More Expressive Languages: Many programming languages cannot be represented with a single pass compilers, for example Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...

 can be implemented with a single pass compiler since it requires that all definitions come before their use.



var x:Integer;
procedure inc;
begin
x:=x+1;
end;
x:=0;
inc;



This in the single pass compiler when x is referenced, the semantic analysis and code generation can be done since the compiler already knows from the declaration of x:
  1. where the value of x has been stored
  2. x's type


Languages like Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

require a multi-pass compiler since the definition of x would not be required to come before the use.



public class Example {
public static void main(String [] args) {
assert(x

0);
x++;
assert(x

1);
}
static int x=0;
}



The multi-pass compiler would allocate the memory location for x during the semantic analysis phase of compilation and would have processed the declaration of x before its use.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK