Thunk (functional programming)
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a thunk (also suspension, suspended computation or delayed computation) is a parameterless
Parameter (computer science)
In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments...

 closure
Closure (computer science)
In computer science, a closure is a function together with a referencing environment for the non-local variables of that function. A closure allows a function to access variables outside its typical scope. Such a function is said to be "closed over" its free variables...

 created to prevent the evaluation of an expression until forced at a later time. In lazy
Lazy evaluation
In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...

 languages thunks are created and forced implicitly. In eager languages thunks can be created explicitly by wrapping an expression in a parameterless anonymous function
Anonymous function
In programming language theory, an anonymous function is a function defined, and possibly called, without being bound to an identifier. Anonymous functions are convenient to pass as an argument to a higher-order function and are ubiquitous in languages with first-class functions such as Haskell...

 and forced by calling it in order to emulate lazy evaluation. Some functional languages require functions to take at least one parameter, requiring the anonymous function to take a dummy parameter, usually of unit type
Unit type
In the area of mathematical logic, and computer science known as type theory, a unit type is a type that allows only one value . The carrier associated with a unit type can be any singleton set. There is an isomorphism between any two such sets, so it is customary to talk about the unit type and...

, instead.

Call by name

Implementations of the call by name and call by need evaluation strategies
Evaluation strategy
In computer science, an evaluation strategy is a set of rules for evaluating expressions in a programming language. Emphasis is typically placed on functions or operators: an evaluation strategy defines when and in what order the arguments to a function are evaluated, when they are substituted...

 typically use a thunk to refer to an incompletely evaluated expression. In this context, the thunk is simply a placeholder for a part of computation which, when executed, returns either the fully evaluated expression, or returns a tag, which shows that there is a part of the expression which is yet to be instantiated, in which case blind, mindless computation cannot yet be performed (because, for example, some variable in the above-mentioned algebraic formula is not yet mapped to a number — hence the formula does not yet reduce to a concrete number).

Call by need

In call-by-need, the thunk is replaced by its return value after its first complete execution. In languages with late binding (for example, Haskell
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...

), the "computation" performed by the thunk may include a lookup in the run-time context of the program to determine the current binding of a variable.

Thunks were invented by Peter Zilahy Ingerman in 1961. According to the inventor, the name "thunk" came about because it could be optimized by the compiler by "thinking about it", so "thunk", according to its inventor, "is the past tense of 'think' at two in the morning"

An early implementation of thunks for call-by-name was in Algol 60
ALGOL 60
ALGOL 60 is a member of the ALGOL family of computer programming languages. It gave rise to many other programming languages, including BCPL, B, Pascal, Simula, C, and many others. ALGOL 58 introduced code blocks and the begin and end pairs for delimiting them...

.

begin
integer idx;
real procedure sum (i, lo, hi, term);
value lo, hi;
integer i, lo, hi;
real term;
comment term is passed by-name, and so is i;
begin
real temp;
temp := 0;
for i := lo step 1 until hi do
temp := temp + term;
sum := temp
end;
print (sum (idx, 1, 100, 1/idx))
end

The above example (see Jensen's Device
Jensen's Device
Jensen's Device is a computer programming technique devised by Danish computer scientist Jørn Jensen, who worked with Peter Naur at Regnecentralen, particularly on the GIER Algol compiler, one of the earliest correct implementations of ALGOL 60....

) relies on the fact that the actual parameters idx and 1/idx are passed "by name", so that the program is equivalent to the "inlined" version

begin
integer idx;
real sum;
begin
real temp;
temp := 0;
for idx := 1 step 1 until 100 do
temp := temp + 1/idx;
sum := temp
end;
print (sum)
end

Notice that the expression 1/idx is not evaluated at the point of the call to sum; instead, it is evaluated anew each time the formal parameter term is mentioned in the definition of sum. A compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

using thunks to implement call-by-name would process the original code as if it had been written using function pointers or lambdas (represented below in Algol-like pseudocode):

begin
integer idx;
real procedure sum (i_lvalue, lo, hi, term_rvalue);
value lo, hi;
integer lo, hi;
thunk i_lvalue;
thunk
term_rvalue;
begin
real temp;
temp := 0;
for i_lvalue := lo step 1 until hi do
temp := temp + term_rvalue;
sum := temp
end;
procedure lvalue_of_idx ;
begin
lvalue_of_idx := address of idx
end;
procedure rvalue_of_1_over_idx ;
begin
rvalue_of_1_over_idx := 1/idx
end;
print (sum (lvalue_of_idx, 1, 100, rvalue_of_1_over_idx))
end

The procedures lvalue_of_idx and rvalue_of_1_over_idx would be generated automatically by the compiler whenever a call-by-name actual parameter was encountered. These automatically generated procedures would be called thunks.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK