Control flow graph
Encyclopedia
A control flow graph in computer science is a representation
Depiction
Depiction is meaning conveyed through pictures. Basically, a picture maps an object to a two-dimensional scheme or picture plane. Pictures are made with various materials and techniques, such as painting, drawing, or prints mosaics, tapestries, stained glass, and collages of unusual and disparate...

, using graph
Graph (mathematics)
In mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges...

 notation, of all paths that might be traversed through a program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

 during its execution
Execution (computers)
Execution in computer and software engineering is the process by which a computer or a virtual machine carries out the instructions of a computer program. The instructions in the program trigger sequences of simple actions on the executing machine...

.

Overview

In a control flow graph each node
Vertex (graph theory)
In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs...

 in the graph
Graph (mathematics)
In mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges...

 represents a basic block
Basic block
In computing, a basic block is a portion of the code within a program with certain desirable properties that make it highly amenable to analysis. Compilers usually decompose programs into their basic blocks as a first step in the analysis process...

, i.e. a straight-line piece of code without any jumps or jump targets; jump targets start a block, and jumps end a block. Directed edges are used to represent jumps in the control flow. There are, in most presentations, two specially designated blocks: the entry block, through which control enters into the flow graph, and the exit block, through which all control flow leaves.

The CFG is essential to many compiler optimizations and static analysis
Static code analysis
Static program analysis is the analysis of computer software that is performed without actually executing programs built from that software In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code...

 tools.

Example

Consider the following fragment of code:

0: (A) t0 = read_num
1: (A) if t0 mod 2 0
2: (B) print t0 + " is even."
3: (B) goto 5
4: (C) print t0 + " is odd."
5: (D) end program

In the above, we have 4 basic blocks: A from 0 to 1, B from 2 to 3, C at 4 and D at 5. In particular, in this case, A is the "entry block", D the "exit block" and lines 4 and 5 are jump targets. A graph for this fragment has edges from A to B, A to C, B to D and C to D.
Reachability
Reachability
Reachability
In graph theory, reachability is the notion of being able to get from one vertex in a directed graph to some other vertex. Note that reachability in undirected graphs is trivial — it is sufficient to find the connected components in the graph, which can be done in linear time.- Definition :For a...

 is another graph property useful in optimization.

If a block/subgraph is not connected from the subgraph containing the entry block, that block is unreachable during any execution, and so is unreachable code
Unreachable code
Unreachable code is a computer programming term for code in the source code of a program which can never be executed because there exists no control flow path to the code from the rest of the program....

; it can be safely removed.

If the exit block is unreachable from the entry block, it indicates an infinite loop
Infinite loop
An infinite loop is a sequence of instructions in a computer program which loops endlessly, either due to the loop having no terminating condition, having one that can never be met, or one that causes the loop to start over...

. Not all infinite loops are detectable, of course; see Halting problem
Halting problem
In computability theory, the halting problem can be stated as follows: Given a description of a computer program, decide whether the program finishes running or continues to run forever...

.

Dead code and some infinite loops are possible even if the programmer didn't explicitly code that way: optimizations like constant propagation and constant folding
Constant folding
Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.- Constant folding...

 followed by jump threading
Jump threading
In computing, jump threading is a compiler optimization of one jump directly to a second jump. If the second condition is a subset or inverse of the first, it can be eliminated, or threaded through the first jump. This is easily done in a single pass through the program, following acyclic chained...

 could collapse multiple basic blocks into one, cause edges to be removed from a CFG, etc., thus possibly disconnecting parts of the graph.
Domination relationship
A block M dominates a block N if every path from the entry that reaches block N has to pass through block M. The entry block dominates all blocks.

In the reverse direction, block M postdominates block N if every path from N to the exit has to pass through block M. The exit block postdominates all blocks.

It is said that a block M immediately dominates block N if M dominates N, and there is no intervening block P such that M dominates P and P dominates N. In other words, M is the last dominator on all paths from entry to N. Each block has a unique immediate dominator.

Similarly, there is a notion of immediate postdominator : Analogous to immediate dominator.

The dominator tree is an ancillary data structure depicting the dominator relationships. There is an arc from Block M to Block N if M is an immediate dominator of N. This graph is a tree, since each block has a unique immediate dominator. This tree is rooted at the entry block. Can be calculated efficiently using Lengauer-Tarjan's algorithm.

A postdominator tree is analogous to the dominator tree. This tree is rooted at the exit block.
Special edges
For practical reasons, it is necessary to introduce some artificial kinds of edges or to process differently some kinds of edges.

A back edge is an edge that points to a block that has already been met during a depth-first (DFS
Depth-first search
Depth-first search is an algorithm for traversing or searching a tree, tree structure, or graph. One starts at the root and explores as far as possible along each branch before backtracking....

) traversal of the graph. Back edges are typical of loops.

A critical edge is an edge which is neither the only edge leaving its source block, nor the only edge entering its destination block. These edges must be split: a new block must be created in the middle of the edge, in order to insert computations on the edge without affecting any other edges.

An abnormal edge is an edge whose destination is unknown. Exception handling
Exception handling
Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of exceptions, special conditions that change the normal flow of program execution....

 constructs can produce them. These edges tend to inhibit optimization.

An impossible edge (also known as a fake edge) is an edge which has been added to the graph solely to preserve the property that the exit block postdominates all blocks. It cannot ever be traversed.
Loop management
A loop header (sometimes called the entry point of the loop) is a dominator that is the target of a loop-forming back edge. The loop header dominates all blocks in the loop body.

Suppose block M is a dominator with several incoming edges, some of them being back edges (so M is a loop header). It is advantageous to several optimization passes to break M up into two blocks Mpre and Mloop. The contents of M and back edges are moved to Mloop, the rest of the edges are moved to point into Mpre, and a new edge from Mpre to Mloop is inserted (so that Mpre is the immediate dominator of Mloop). In the beginning, Mpre would be empty, but passes like loop-invariant code motion
Loop-invariant code motion
In computer programming, loop-invariant code consists of statements or expressions which can be moved outside the body of a loop without affecting the semantics of the program...

 could populate it. Mpre is called the loop pre-header, and Mloop would be the loop header.
See also

  • Flowchart
    Flowchart
    A flowchart is a type of diagram that represents an algorithm or process, showing the steps as boxes of various kinds, and their order by connecting these with arrows. This diagrammatic representation can give a step-by-step solution to a given problem. Process operations are represented in these...

  • Control flow analysis
    Control flow analysis
    Control flow analysis is a static code analysis technique for determining the control flow of a program. The control flow is expressed as a control flow graph .For many languages, the control flow of a program is explicit in a program's source code...

  • Control flow diagram
    Control flow diagram
    A control flow diagram is a diagram to describe the control flow of a business process, process or program.Control flow diagrams were developed in the 1950s, and are widely used in multiple engineering disciplines...

  • Program dependence graph
    Dependency graph
    In mathematics, computer science and digital electronics, a dependency graph is a directed graph representing dependencies of several objects towards each other...


External links

Examples
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK