In compiler theory, loop dependence analysis is the task of determining whether statements within a loop body form a dependence, with respect to array access and modification, induction, reduction and private variables, simplification of loop-independent code and management of conditional branches inside the loop body. It is a prerequisite for automatic parallelization, which converts sequential code into multi-threaded or vectorized code so that multiple processors can work simultaneously on a shared-memory machine.
Loop dependence analysis is performed on a normalized loop, one in which each loop variable starts at 0 and is incremented by one at every iteration until its exit condition is met, of the form:
for i1 until U1 do
for i2 until U2 do
...
for in until Un do
S1  a[f1(i1, ..., in), ..., fn(i1, ..., in)] := ...
S2  ... := a[h1(i1, ..., in), ..., hn(i1, ..., in)]
done
done
done
where a is an n-dimensional array and f1, ..., fn and h1, ..., hn are functions mapping each iteration vector (i1, ..., in) to a memory access in a particular dimension of the array.
For example, in C:
for (i = 0; i < U1; i++)
    for (j = 0; j < U2; j++)
        a[i+4-j] = b[2*i-j] + i*j;
Here f1 would be i+4-j, controlling the write on the first dimension of a, and h1 would be 2*i-j, controlling the read on the first dimension of b.
The scope of the problem is to find all possible dependencies between S1 and S2. To be conservative, any dependence which cannot be proven false must be assumed to be true.
Independence is shown by demonstrating that no two instances of S1 and S2 access or modify the same spot in array a. When a possible dependence is found, loop dependence analysis usually makes every attempt to characterize the relationship between dependent instances, as some optimizations may still be possible. It may also be possible to transform the loop to remove or modify the dependence.
In the course of (dis)proving such dependencies, a statement S may be decomposed according to which iteration it comes from. For instance, S[1,3,5] refers to the iteration where i1 = 1, i2 = 3 and i3 = 5. Of course, references to abstract iterations, such as S[d1+1,d2,d3], are both permitted and common.
Iteration vectors
A specific iteration through a normalized loop is referenced through an iteration vector, which encodes the state of each iteration variable.
For a loop, an iteration vector is a member of the Cartesian product
of the bounds for the loop variables. In the normalized form given previously, this space is defined to be U1 × U2 × ... × Un. Specific instances of statements may be parametrized by these iteration vectors, and they are also the domain of the array subscript functions found in the body of the loop. Of particular relevance, these vectors form a lexicographic order which corresponds with the chronological execution order.
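As an illustration (a minimal sketch, not from the original text; the function name iteration_vectors is made up), enumerating the iteration space of a doubly nested normalized loop yields the vectors in exactly this lexicographic, and therefore chronological, order:

```c
/* Collect the iteration vectors of a doubly nested normalized loop
 * with bounds U1 and U2 into out[][2]. The nested-loop traversal
 * produces them in lexicographic order, which coincides with the
 * chronological execution order. Returns the count, U1 * U2. */
int iteration_vectors(int U1, int U2, int out[][2]) {
    int k = 0;
    for (int i1 = 0; i1 < U1; i1++)
        for (int i2 = 0; i2 < U2; i2++) {
            out[k][0] = i1;
            out[k][1] = i2;
            k++;
        }
    return k;
}
```

For U1 = 2 and U2 = 3 this yields (0,0), (0,1), (0,2), (1,0), (1,1), (1,2).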
Dependence vectors
To classify data dependence, compilers use two important vectors: the distance vector (σ), which indicates in each loop dimension how many iterations separate the two accesses fn and hn, and the direction vector (ρ), which indicates the corresponding direction, essentially the sign of each distance.
The distance vector is defined as σ = (σ1, ..., σk), where σn = in - jn is the difference between the nth components of the iteration vectors i and j of the two dependent instances.
The direction vector is defined as ρ = (ρ1, ..., ρk), where:
ρn = (<) if σn > 0, i.e. fn < hn
ρn = (=) if σn = 0, i.e. fn = hn
ρn = (>) if σn < 0, i.e. fn > hn
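A minimal sketch of how both vectors could be derived from the iteration vectors i and j of two dependent statement instances (the function name and signature are illustrative, not from the source):

```c
/* Given the iteration vectors i and j (each of length k) of two
 * dependent statement instances, fill sigma with the distance
 * vector (sigma_n = i_n - j_n) and rho with the direction vector,
 * encoded as the characters '<', '=' and '>'. */
void dependence_vectors(const int i[], const int j[], int k,
                        int sigma[], char rho[]) {
    for (int n = 0; n < k; n++) {
        sigma[n] = i[n] - j[n];
        rho[n] = (sigma[n] > 0)  ? '<'   /* fn < hn */
               : (sigma[n] == 0) ? '='   /* fn = hn */
               : '>';                    /* fn > hn */
    }
}
```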
Classification
A dependence between two operations a and b can be classified according to the following criteria:
Operation type
If a is a write and b is a read, this is a flow (true) dependence
If a is a read and b is a write, this is an anti-dependence
If a is a write and b is a write, this is an output dependence
If a is a read and b is a read, this is an input dependence
Chronological order
If Sa < Sb, this is a lexically forward dependence
If Sa = Sb, this is a self-dependence
If Sa > Sb, this is a lexically backward dependence
Loop dependence
If all distances (σ) are zero (both accesses occur in the same iteration), this is a loop-independent dependence
If at least one distance is non-zero, this is a loop-carried dependence
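For illustration, two minimal loops (assumed examples, not taken from the original text) showing each kind:

```c
#define N 8

/* Loop-independent dependence: S2 reads a[i] written by S1 in the
 * same iteration, so every distance is zero and the iterations do
 * not depend on one another. */
void loop_independent(const int b[N], int a[N], int c[N]) {
    for (int i = 0; i < N; i++) {
        a[i] = b[i] + 1;   /* S1 */
        c[i] = a[i] * 2;   /* S2 */
    }
}

/* Loop-carried dependence: each iteration reads a[i-1], which was
 * written one iteration earlier, giving a distance of 1; the
 * iterations must therefore execute in order. */
void loop_carried(const int b[N], int a[N]) {
    for (int i = 1; i < N; i++)
        a[i] = a[i - 1] + b[i];
}
```

The first loop can be parallelized or vectorized freely; the second forms a prefix-sum-style chain that cannot be parallelized without restructuring.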
Plausibility
Some loop dependence relations can be parallelized (or vectorized) and some cannot. Each case must be analysed separately, but as a general rule of thumb, the following table covers most cases:
ρ \ order      Lexically forward   Self-dependence              Lexically backward
positive (<)   plausible           plausible                    plausible
zero (=)       implausible         δa: plausible                plausible
                                   δf: implausible
negative (>)   implausible         implausible                  implausible

(where δf denotes a flow dependence and δa an anti-dependence)
Some implausible dependences can be transformed into plausible ones, for example by rearranging the statements.
Alias detection
Inside loops, the same variable can be accessed for both read and write, at the same or a different location within each iteration. Moreover, the same region of memory can be reached through different variables. When the same region in memory can be accessed by more than one variable, there is an alias, and alias analysis is used to determine which storage locations may be accessed in more than one way.
Some aliases are very simple to detect:
a = b;
for (i = 0; i < MAX; ++i)
    a[i] = b[i+1];
a[MAX] = 0;
It is obvious that b is an alias to a, so this code is actually shifting the array to the left. But aliasing is not always so obvious: for instance, the standard C library function strcpy copies one string to another, but the caller could provide overlapping regions like this:
strcpy(x, x+1);
when the internal loop could be implemented as:
while (*src != 0) {
    *dst = *src;
    src++; dst++;
}
*dst = 0;   /* copy the terminating null */
The dependency between src and dst is not obvious from within the function; every caller would have to be analysed to make sure no overlap can occur. For a library function there is no way to guarantee this, so the compiler has to assume one way or the other. If it assumes there is no alias and the regions do overlap, the result is undefined behaviour; if it assumes there is an alias, the code is never optimized, even in the cases where it would be safe.
Some compilers accept special keywords with which the programmer asserts that no aliasing occurs, such as C99's restrict qualifier.
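A minimal sketch of restrict in C99 (the function copy_n below is illustrative, not part of the source): the qualifier is a promise by the programmer that the two regions never overlap, which lets the compiler optimize without a runtime check; breaking the promise is undefined behaviour.

```c
#include <stddef.h>

/* With both pointers qualified restrict, the compiler may assume
 * dst and src never alias, so it is free to vectorize or reorder
 * the copy. Calling this with overlapping regions, as in
 * strcpy(x, x+1), would be undefined behaviour. */
void copy_n(char *restrict dst, const char *restrict src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```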
Techniques
Several established devices and techniques exist for tackling the loop dependence problem. For determining whether a dependence exists, the GCD test and the Banerjee test are the most general tests in common use, while a variety of techniques exist for simpler cases. The GCD test rules out a dependence when the greatest common divisor of the subscript coefficients does not divide the constant term of the dependence equation; the Banerjee test never misses a dependence that exists, but it may report false positives.
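As a sketch of the GCD test: a dependence between an access a[x*i + k1] in iteration i1 and a[y*i + k2] in iteration i2 requires an integer solution to x*i1 - y*i2 = k2 - k1, and such a solution exists only if gcd(x, y) divides k2 - k1 (the function names below are illustrative):

```c
#include <stdlib.h>

static int gcd(int a, int b) {
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* GCD test for the accesses a[x*i + k1] and a[y*i + k2]: returns 0
 * when the dependence equation x*i1 - y*i2 = k2 - k1 has no integer
 * solution (independence proven), and 1 when a dependence cannot be
 * ruled out. The test ignores the loop bounds, so a result of 1
 * does not guarantee that a dependence actually occurs. */
int gcd_test_may_depend(int x, int k1, int y, int k2) {
    int g = gcd(abs(x), abs(y));
    if (g == 0)               /* both coefficients are zero */
        return k1 == k2;
    return (k2 - k1) % g == 0;
}
```

For example, for a loop body containing a[2*i] = a[2*i + 1] + 1, the equation 2*i1 - 2*i2 = 1 has no integer solution, since gcd(2, 2) = 2 does not divide 1, so the two accesses are proven independent.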
See also
Dependence analysis
Alias analysis
Loop splitting
Loop fusion
Loop interchange
Automatic parallelization
Vectorization (parallel computing)
The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.