Referential transparency
Encyclopedia
Referential transparency and referential opaqueness are properties of parts of computer program
s. An expression
is said to be referentially transparent if it can be replaced with its value
without changing the behavior of a program (in other words, yielding a program that has the same effects and output on the same input). The opposite term is referentially opaque.
While in mathematics
all function applications are referentially transparent, in programming this is not always the case. The importance of referential transparency is that it allows the programmer
and the compiler
to reason about program behavior. This can help in proving correctness, simplifying an algorithm
, assisting in modifying code without breaking it, or optimizing
code by means of memoization
, common subexpression elimination
or parallelization.
Referential transparency is one of the principles of functional programming
; only referentially transparent functions can be memoized (transformed into equivalent functions which cache results). Some programming language
s provide means to guarantee referential transparency.
Some functional programming languages enforce referential transparency for all functions.
As referential transparency requires the same results for a given set of inputs at any point in time, a referentially transparent expression is therefore deterministic
by definition.
s, then the expression is referentially transparent. Also, some impure functions can be included in the expression if their values are discarded and their side effect
s are insignificant.
Take a function that takes no parameters and returns input from the keyboard
. A call to this function might be
A more subtle example is that of a function that uses a global variable
(or a dynamically scope
d variable, or a lexical closure
) to help it compute its results. Since this variable is not passed as a parameter but can be altered, the results of subsequent calls to the function can differ even if the parameters are identical. In pure functional programming
, destructive assignment is not allowed; thus, a function that uses global (or dynamically scoped) variables may or may not be referentially transparent, depending on whether the global variables are immutable.
Arithmetic operations are referentially transparent:
Assignments are not transparent. For instance, the C
expression
.
, and part of the semantics of an imperative programming language.
However, because a referentially transparent expression can be evaluated at any time, it is not necessary to define sequence points nor any guarantee of the order of evaluation at all. Programming done without these considerations is called purely functional programming
.
One advantage of writing code in a referentially transparent style is that given an intelligent compiler, static code analysis
is easier and better code-improving transformations are possible automatically. For example, when programming in C, there will be a performance penalty for including a call to an expensive function inside a loop, even if the function call could be moved outside of the loop without changing the results of the program. The programmer would be forced to perform manual code motion of the call, possibly at the expense of source code readability. However, if the compiler is able to determine that the function call is referentially transparent, it can perform this transformation automatically.
The primary disadvantage of languages which enforce referential transparency is that it makes the expression of operations that naturally fit a sequence-of-steps imperative programming style more awkward and less concise. Such languages often incorporate mechanisms to make these tasks easier while retaining the purely functional quality of the language, such as definite clause grammar
s and monads
.
With referential transparency, no difference is made or recognized between a reference to a thing and the corresponding thing itself. Without referential transparency, such difference can be easily made and utilized in programs.
The function
The referential opacity of
integer p = rq(x) + rq(y) * (rq(x) - rq(x));
One may be tempted to simplify this statement to:
integer p = rq(x) + rq(y) * (0);
integer p = rq(x) + 0;
integer p = rq(x);
However, this will not work for
Such mathematical identities will hold for referentially-transparent functions such as
However, a more sophisticated analysis can be used to simplify the statement to:
integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * (x + a + 3 - (x + a + 4)); globalValue = globalValue + 4;
integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * (x + a + 3 - x - a - 4)); globalValue = globalValue + 4;
integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * -1; globalValue = globalValue + 4;
integer a = globalValue; integer p = x + a + 1 - y - a - 2; globalValue = globalValue + 4;
integer p = x - y - 1; globalValue = globalValue + 4;
This takes more steps and requires a degree of insight into the code infeasible for compiler optimization.
Therefore, referential transparency allows us to reason about our code which will lead to more robust programs, the possibility of finding bugs that we couldn't hope to find by testing, and the possibility of seeing opportunities for optimization
.
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...
s. An expression
Expression (programming)
An expression in a programming language is a combination of explicit values, constants, variables, operators, and functions that are interpreted according to the particular rules of precedence and of association for a particular programming language, which computes and then produces another value...
is said to be referentially transparent if it can be replaced with its value
Value (computer science)
In computer science, a value is an expression which cannot be evaluated any further . The members of a type are the values of that type. For example, the expression "1 + 2" is not a value as it can be reduced to the expression "3"...
without changing the behavior of a program (in other words, yielding a program that has the same effects and output on the same input). The opposite term is referentially opaque.
While in mathematics
Mathematics
Mathematics is the study of quantity, space, structure, and change. Mathematicians seek out patterns and formulate new conjectures. Mathematicians resolve the truth or falsity of conjectures by mathematical proofs, which are arguments sufficient to convince other mathematicians of their validity...
all function applications are referentially transparent, in programming this is not always the case. The importance of referential transparency is that it allows the programmer
Programmer
A programmer, computer programmer or coder is someone who writes computer software. The term computer programmer can refer to a specialist in one area of computer programming or to a generalist who writes code for many kinds of software. One who practices or professes a formal approach to...
and the compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
to reason about program behavior. This can help in proving correctness, simplifying an algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
, assisting in modifying code without breaking it, or optimizing
Optimization (computer science)
In computer science, program optimization or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources...
code by means of memoization
Memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...
, common subexpression elimination
Common subexpression elimination
In computer science, common subexpression elimination is a compiler optimization that searches for instances of identical expressions , and analyses whether it is worthwhile replacing them with a single variable holding the computed value.- Example :In the following code: a = b * c + g; d = b * c...
or parallelization.
Referential transparency is one of the principles of functional programming
Functional programming
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
; only referentially transparent functions can be memoized (transformed into equivalent functions which cache results). Some programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s provide means to guarantee referential transparency.
Some functional programming languages enforce referential transparency for all functions.
As referential transparency requires the same results for a given set of inputs at any point in time, a referentially transparent expression is therefore deterministic
Deterministic system (philosophy)
A deterministic system is a conceptual model of the philosophical doctrine of determinism applied to a system for understanding everything that has and will occur in the system, based on the physical outcomes of causality. In a deterministic system, every action, or cause, produces a reaction, or...
by definition.
Examples and counterexamples
If all functions involved in the expression are pure functionPure function
In computer programming, a function may be described as pure if both these statements about the function hold:# The function always evaluates the same result value given the same argument value...
s, then the expression is referentially transparent. Also, some impure functions can be included in the expression if their values are discarded and their side effect
Side effect (computer science)
In computer science, a function or expression is said to have a side effect if, in addition to returning a value, it also modifies some state or has an observable interaction with calling functions or the outside world...
s are insignificant.
Take a function that takes no parameters and returns input from the keyboard
Computer keyboard
In computing, a keyboard is a typewriter-style keyboard, which uses an arrangement of buttons or keys, to act as mechanical levers or electronic switches...
. A call to this function might be
GetInput
. The return value of GetInput
depends on what the user types in, so multiple calls to GetInput
with identical parameters (the empty list) may return different results. Therefore, GetInput
is neither determined nor referentially transparent.A more subtle example is that of a function that uses a global variable
Global variable
In computer programming, a global variable is a variable that is accessible in every scope . Interaction mechanisms with global variables are called global environment mechanisms...
(or a dynamically scope
Scope (programming)
In computer programming, scope is an enclosing context where values and expressions are associated. Various programming languages have various types of scopes. The type of scope determines what kind of entities it can contain and how it affects them—or semantics...
d variable, or a lexical closure
Closure (computer science)
In computer science, a closure is a function together with a referencing environment for the non-local variables of that function. A closure allows a function to access variables outside its typical scope. Such a function is said to be "closed over" its free variables...
) to help it compute its results. Since this variable is not passed as a parameter but can be altered, the results of subsequent calls to the function can differ even if the parameters are identical. In pure functional programming
Functional programming
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
, destructive assignment is not allowed; thus, a function that uses global (or dynamically scoped) variables may or may not be referentially transparent, depending on whether the global variables are immutable.
Arithmetic operations are referentially transparent:
5*5
can be replaced by 25
, for instance. In fact, all functions in the mathematical sense are referentially transparent: sin(x)
is transparent, since it will always give the same result for each particular x
.Assignments are not transparent. For instance, the C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
expression
x = x + 1
changes the value assigned to the variable x
. Assuming x
initially has value 10
, two consecutive evaluations of the expression yield, respectively, 11
and 12
. Clearly, replacing x = x + 1
with either 11
or 12
gives a program with different meaning, and so the expression is not referentially transparent. However, calling a function such as int plusone(int x) {return x+1;}
is transparent, as it will not implicitly change the input x and thus has no such side effectsSide effect (computer science)
In computer science, a function or expression is said to have a side effect if, in addition to returning a value, it also modifies some state or has an observable interaction with calling functions or the outside world...
.
today
is not transparent, as if you evaluate it and replace it by its value (say, "Jan 1, 2001"), you don't get the same result as you will if you run it tomorrow. This is because it depends on a state (the time).Contrast to imperative programming
If the substitution of an expression with its value is valid only at a certain point in the execution of the program, then the expression is not referentially transparent. The definition and ordering of these sequence points are the theoretical foundation of imperative programmingImperative programming
In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state...
, and part of the semantics of an imperative programming language.
However, because a referentially transparent expression can be evaluated at any time, it is not necessary to define sequence points nor any guarantee of the order of evaluation at all. Programming done without these considerations is called purely functional programming
Purely functional
Purely functional is a term in computing used to describe algorithms, data structures or programming languages that exclude destructive modifications...
.
One advantage of writing code in a referentially transparent style is that given an intelligent compiler, static code analysis
Static code analysis
Static program analysis is the analysis of computer software that is performed without actually executing programs built from that software In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code...
is easier and better code-improving transformations are possible automatically. For example, when programming in C, there will be a performance penalty for including a call to an expensive function inside a loop, even if the function call could be moved outside of the loop without changing the results of the program. The programmer would be forced to perform manual code motion of the call, possibly at the expense of source code readability. However, if the compiler is able to determine that the function call is referentially transparent, it can perform this transformation automatically.
The primary disadvantage of languages which enforce referential transparency is that it makes the expression of operations that naturally fit a sequence-of-steps imperative programming style more awkward and less concise. Such languages often incorporate mechanisms to make these tasks easier while retaining the purely functional quality of the language, such as definite clause grammar
Definite clause grammar
A definite clause grammar is a way of expressing grammar, either for natural or formal languages, in a logic programming language such as Prolog. DCGs are usually associated with Prolog, but similar languages such as Mercury also include DCGs...
s and monads
Monads in functional programming
In functional programming, a monad is a programming structure that represents computations. Monads are a kind of abstract data type constructor that encapsulate program logic instead of data in the domain model...
.
With referential transparency, no difference is made or recognized between a reference to a thing and the corresponding thing itself. Without referential transparency, such difference can be easily made and utilized in programs.
Another example
As an example, let's use two functions, one which is referentially opaque, and the other which is referentially transparent:
globalValue = 0;
integer function rq(integer x)
begin
globalValue = globalValue + 1;
return x + globalValue;
end
integer function rt(integer x)
begin
return x + 1;
end
The function
rt
is referentially transparent, which means that rt(x) = rt(y)
if x = y
. For instance, rt(6) = 6 + 1 = 7, rt(4) = 4 + 1 = 5
, and so on. However, we can't say any such thing for rq
because it uses a global variable which it modifies.The referential opacity of
rq
makes reasoning about programs more difficult. For example, say we wish to reason about the following statement:integer p = rq(x) + rq(y) * (rq(x) - rq(x));
One may be tempted to simplify this statement to:
integer p = rq(x) + rq(y) * (0);
integer p = rq(x) + 0;
integer p = rq(x);
However, this will not work for
rq
because each occurrence of rq(x)
evaluates to a different value. Remember, that the return value of rq
is based on a global value which isn't passed in and which gets modified on each call to rq
. This means that mathematical identities such as no longer hold.Such mathematical identities will hold for referentially-transparent functions such as
rt
.However, a more sophisticated analysis can be used to simplify the statement to:
integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * (x + a + 3 - (x + a + 4)); globalValue = globalValue + 4;
integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * (x + a + 3 - x - a - 4)); globalValue = globalValue + 4;
integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * -1; globalValue = globalValue + 4;
integer a = globalValue; integer p = x + a + 1 - y - a - 2; globalValue = globalValue + 4;
integer p = x - y - 1; globalValue = globalValue + 4;
This takes more steps and requires a degree of insight into the code infeasible for compiler optimization.
Therefore, referential transparency allows us to reason about our code which will lead to more robust programs, the possibility of finding bugs that we couldn't hope to find by testing, and the possibility of seeing opportunities for optimization
Optimization (computer science)
In computer science, program optimization or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources...
.