Tail call
In computer science, a tail call is a subroutine call that happens inside another procedure and that produces a return value, which is then immediately returned by the calling procedure. The call site is then said to be in tail position, i.e. at the end of the calling procedure. If a subroutine performs a tail call to itself, it is called tail-recursive. This is a special case of recursion.
Tail calls are significant because they can be implemented without adding a new stack frame to the call stack. Most of the frame of the current procedure is not needed any more, and it can be replaced by the frame of the tail call, modified as appropriate (similar to overlay for processes, but for function calls). The program can then jump to the called subroutine. Producing such code instead of a standard call sequence is called tail call elimination, or tail call optimization.
Traditionally, tail call elimination is optional. However, in functional programming languages, tail call elimination is often guaranteed by the language standard, and this guarantee allows using recursion, in particular tail recursion, in place of loops. In such cases, it is not correct to refer to it as an optimization, even if it is customary practice.
Description
When a function is called, the computer must "remember" the place it was called from, the return address, so that it can return to that location with the result once the call is complete. Typically, this information is saved on the call stack, a simple list of return locations in order of the times that the call locations they describe were reached. For tail calls, there is no need to remember the place we are calling from; instead, we can perform tail call elimination by leaving the stack alone (except possibly for function arguments and local variables), and the newly called function will return its result directly to the original caller. Note that the tail call doesn't have to appear lexically after all other statements in the source code; it is only important that its result be immediately returned, since the calling function will never get a chance to do anything after the call if the optimization is performed.
For non-recursive function calls, this is usually an optimization that saves little time and space, since there are not that many different functions available to call. When dealing with recursive or mutually recursive functions where recursion happens through tail calls, however, the stack space and the number of returns saved can grow to be very significant, since a function can call itself, directly or indirectly, many times. In fact, it often asymptotically reduces stack space requirements from linear, or O(n), to constant, or O(1). Tail call elimination is thus required by the standard definitions of some programming languages, such as Scheme, languages in the ML family, and Haskell. In the case of Scheme, the language definition formalizes the intuitive notion of tail position exactly, by specifying which syntactic forms allow having results in tail context. Implementations allowing an unlimited number of tail calls to be active at the same moment, thanks to tail call elimination, can also be called 'properly tail-recursive'.
Besides space and execution efficiency, tail call elimination is important in the functional programming idiom known as continuation passing style (CPS), which would otherwise quickly run out of stack space.
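Syntactic form
A tail call can be located just before the syntactical end of a subroutine:
function foo(data)
    A(data);
    return B(data);
Here, both A(data) and B(data) are calls, but B is the last thing the procedure executes before returning and is thus in tail position. However, not all tail calls are necessarily located at the syntactical end of a subroutine. Consider:
function bar(data)
    if A(data)
        return B(data);
    else
        return C(data);
Here, the calls to B and C are both in tail position, even though the first one is not syntactically at the end of bar's body.
Now consider this code:
function foo1(data)
    return A(data) + 1;
function foo2(data)
    var ret = A(data);
    return ret;
function foo3(data)
    var ret = A(data);
    return (ret == 0) ? 1 : ret;
Here, the call to A(data) is in tail position in foo2, but it is not in tail position either in foo1 or in foo3, because control must return to the caller to allow it to inspect or modify the return value before returning it.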
Example programs
Take this Scheme program as an example:
This program is not written in a tail recursion style. Now take this Scheme program as an example:
The inner procedure fact calls itself last in the control flow. This allows an interpreter or compiler to reorganize the execution which would ordinarily look like this:
call factorial (3)
 call fact (3 1)
  call fact (2 3)
   call fact (1 6)
    call fact (0 6)
    return 6
   return 6
  return 6
 return 6
return 6
into the more efficient variant, in terms of space:
call factorial (3)
replace arguments with (3 1), jump to "fact"
replace arguments with (2 3), jump to "fact"
replace arguments with (1 6), jump to "fact"
replace arguments with (0 6), jump to "fact"
return 6
This reorganization saves space because no state except for the calling function's address needs to be saved, either on the stack or on the heap. The programmer therefore need not worry about running out of stack or heap space for extremely deep recursions. In typical implementations the tail-recursive variant is also faster than the other variant, but only by a constant factor.
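The two shapes of recursion discussed here can also be sketched in C. This is only an illustration under stated assumptions: the names factorial_plain, fact and factorial are chosen for this sketch, and the C standard does not require a compiler to eliminate the tail call in fact, although optimizing compilers often do.

```c
#include <stdint.h>

/* Not tail-recursive: the multiplication happens after the recursive
   call returns, so each call needs its own stack frame. */
uint64_t factorial_plain(unsigned n)
{
    if (n == 0)
        return 1;
    return n * factorial_plain(n - 1);
}

/* Tail-recursive helper (hypothetical name): the recursive call is the
   last action, and the running product travels in the accumulator acc. */
static uint64_t fact(unsigned i, uint64_t acc)
{
    if (i == 0)
        return acc;
    return fact(i - 1, acc * i);   /* tail call */
}

uint64_t factorial(unsigned n)
{
    return fact(n, 1);
}
```

Calling factorial(3) walks through fact(3, 1), fact(2, 3), fact(1, 6) and fact(0, 6), the same argument sequence as in the trace. Results overflow a 64-bit integer for n above 20.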
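Some programmers working in functional languages will rewrite recursive code to be tail-recursive so they can take advantage of this feature. This often requires addition of an "accumulator" argument (acc in the above example) to the function. In some cases (such as filtering lists) and in some languages, full tail recursion may require a function that was previously purely functional to be written such that it mutates references stored in other variables.
An example in pseudo-C follows. Suppose we have the following functions:
Function a can be changed to:
There are possible aliasing problems but this is the basic idea.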
Tail recursion modulo cons
Tail recursion modulo cons is a generalization of tail recursion optimization introduced by David H. D. Warren in the context of compilation of Prolog, seen as an explicitly set-once language. As the name suggests, it applies when the only operation left to perform after a recursive call is to prepend a known value in front of a list returned from it (or to perform a constant number of simple data-constructing operations in general). Such a call would thus be a tail call save for the said cons operation. But prefixing a value at the start of a list on exit from a recursive call is the same as appending this value at the end of the growing list on entry into the recursive call, thus building the list as a side effect. The following Prolog fragment illustrates the concept:
Thus such a call is transformed into creating a new list node, setting its first field, and then making a tail call which is also passed a pointer to where its result should be written (here, the node's rest field).
As another example, consider a function in the C language that duplicates a linked list:
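A minimal sketch of such a function might look as follows. The node type list, with an integer value field and a next link, is an assumption made for this illustration, as is the choice to abort when allocation fails.

```c
#include <stdlib.h>

/* Hypothetical node type: one integer payload and a link. */
typedef struct list {
    int value;
    struct list *next;
} list;

/* Returns a freshly allocated copy of the list starting at input. */
list *duplicate(const list *input)
{
    list *head;

    if (input == NULL)
        return NULL;
    head = malloc(sizeof *head);
    if (head == NULL)
        abort();                 /* out of memory: give up in this sketch */
    head->value = input->value;
    /* Not a tail call: the result still has to be stored in head->next. */
    head->next = duplicate(input->next);
    return head;
}
```

The caller owns the returned nodes and is expected to free them.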
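In this form the function is not tail-recursive, because control returns to the caller after the recursive call duplicates the rest of the input list. Even though it actually allocates the head node prior to duplicating the rest, the caller still has to plug in the result from the callee into the next field. So the function is almost tail-recursive. Warren's method gives the following purely tail-recursive implementation which passes the head node to the callee to have its next field set by it: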
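A tail-recursive list duplication in the style of Warren's method can be sketched as below. The node type list and the helper name duplicate_aux are assumptions of this sketch, and a C compiler is free, but not obliged, to turn the marked tail call into a jump.

```c
#include <stdlib.h>

/* Hypothetical node type: one integer payload and a link. */
typedef struct list {
    int value;
    struct list *next;
} list;

/* Tail-recursive worker: appends a copy of input after the node end,
   writing its result through end->next instead of returning it. */
static void duplicate_aux(const list *input, list *end)
{
    if (input == NULL) {
        end->next = NULL;
        return;
    }
    end->next = malloc(sizeof *end->next);
    if (end->next == NULL)
        abort();                 /* out of memory: give up in this sketch */
    end->next->value = input->value;
    duplicate_aux(input->next, end->next);   /* tail call */
}

list *duplicate(const list *input)
{
    list head;                   /* dummy node; only head.next is used */

    duplicate_aux(input, &head);
    return head.next;
}
```

The dummy head node lives in the parent frame, so the worker never has to treat the first element specially.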
Note how the callee now appends to the end of the list, rather than have the caller prepend to the beginning. Characteristically for this technique, a parent frame is created here in the execution call stack, which calls (non-tail-recursively) into the tail-recursive callee. The callee could reuse its call frame if tail call optimization were guaranteed in C, thus defining an iterative computation.
This properly tail-recursive implementation can be converted into explicitly iterative form:
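One way the explicitly iterative form might look is sketched here, again assuming a hypothetical node type list with value and next fields. The loop variable end plays the role of the node passed to the tail-recursive callee.

```c
#include <stdlib.h>

/* Hypothetical node type: one integer payload and a link. */
typedef struct list {
    int value;
    struct list *next;
} list;

list *duplicate(const list *input)
{
    list head;                   /* dummy node; only head.next is used */
    list *end = &head;           /* last node of the copy built so far */

    while (input != NULL) {
        end->next = malloc(sizeof *end->next);
        if (end->next == NULL)
            abort();             /* out of memory: give up in this sketch */
        end->next->value = input->value;
        input = input->next;     /* "replace the arguments" ... */
        end = end->next;         /* ... and go round the loop again */
    }
    end->next = NULL;
    return head.next;
}
```

Each loop iteration corresponds to one tail call of the recursive version, so stack use stays constant regardless of list length.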
History
In a paper delivered to the ACM conference in Seattle in 1977, Guy L. Steele summarized the debate over the GOTO and structured programming, and observed that procedure calls in the tail position of a procedure can be best treated as a direct transfer of control to the called procedure, typically eliminating unnecessary stack manipulation operations. Since such "tail calls" are very common in Lisp, a language where procedure calls are ubiquitous, this form of optimization considerably reduces the cost of a procedure call compared to other implementations. Steele argued that poorly implemented procedure calls had led to an artificial perception that the GOTO was cheap compared to the procedure call. Steele further argued that "in general procedure calls may be usefully thought of as GOTO statements which also pass parameters, and can be uniformly coded as [machine code] JUMP instructions", with the machine code stack manipulation instructions "considered an optimization (rather than vice versa!)". Steele cited evidence that well optimized numerical algorithms in Lisp could execute faster than code produced by then-available commercial Fortran compilers because the cost of a procedure call in Lisp was much lower. In Scheme, a Lisp dialect developed by Steele with Gerald Jay Sussman, tail call elimination is mandatory.
Implementation methods
Tail recursion is important to some high-level languages, especially functional and logic languages and members of the Lisp family. In these languages, tail recursion is the most commonly used way (and sometimes the only way available) of implementing iteration. The language specification of Scheme requires that tail calls are to be optimized so as not to grow the stack. Tail calls can be made explicitly in Perl, with a variant of the "goto" statement that takes a function name:
goto &NAME;
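Various implementation methods are available.
In assembler
For compilers generating assembly directly, tail call elimination is easy: it suffices to replace a call opcode with a jump one, after fixing parameters on the stack. From a compiler's perspective, a procedure foo that calls B and then returns the result of calling A (the first example above, with the roles of A and B exchanged) is initially translated into pseudo-assembly language:
foo:
  call B
  call A
  ret
Tail call elimination replaces the last two lines with a single jump instruction:
foo:
  call B
  jmp  A
After subroutine A completes, it will then return directly to the return address of foo, omitting the unnecessary ret statement.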
Typically, the subroutines being called need to be supplied with parameters. The generated code thus needs to make sure that the call frame for A is properly set up before jumping to the tail-called subroutine. For instance, on platforms where the call stack does not just contain the return address, but also the parameters for the subroutine, the compiler may need to emit instructions to adjust the call stack. On such a platform, consider the code:
function foo(data1, data2)
    B(data1)
    return A(data2)
where data1 and data2 are parameters. A compiler might translate to the following pseudo assembly code:
foo:
  mov  reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
  push reg            ; put data1 on stack where B expects it
  call B              ; B uses data1
  pop                 ; remove data1 from stack
  mov  reg,[sp+data2] ; fetch data2 from stack (sp) parameter into a scratch register.
  push reg            ; put data2 on stack where A expects it
  call A              ; A uses data2
  pop                 ; remove data2 from stack.
  ret
A tail call optimizer could then change the code to:
foo:
  mov  reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
  push reg            ; put data1 on stack where B expects it
  call B              ; B uses data1
  pop                 ; remove data1 from stack
  mov  reg,[sp+data2] ; get a copy of data2 into a scratch register
  mov  [sp+data1],reg ; put data2 where A expects it
  jmp  A              ; A uses data2 and returns immediately to caller.
This changed code is more efficient both in terms of execution speed and use of stack space.
Since many Scheme compilers use C as an intermediate target code, the problem comes down to coding tail recursion in C without growing the stack. Many implementations achieve this by using a device known as a trampoline, a piece of code that repeatedly calls functions. All functions are entered via the trampoline. When a function has to call another, instead of calling it directly it returns the address of the function to be called, the arguments to be used, and so on, to the trampoline. This ensures that the C stack does not grow and iteration can continue indefinitely.
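A trampoline of this kind can be sketched in C as follows. All names here (bounce, step_fn, is_even_step, is_odd_step, trampoline, is_even) are hypothetical, and the mutually recursive even/odd test merely stands in for real compiled code. Each step function returns a description of the next call instead of making it.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct bounce bounce;
typedef bounce (*step_fn)(unsigned long n);

/* What a step hands back to the trampoline: either the next function
   and its argument, or (when next is NULL) the final result. */
struct bounce {
    step_fn next;
    unsigned long arg;
    bool result;
};

static bounce is_odd_step(unsigned long n);

static bounce is_even_step(unsigned long n)
{
    if (n == 0)
        return (bounce){ NULL, 0, true };
    return (bounce){ is_odd_step, n - 1, false };   /* "call" is_odd */
}

static bounce is_odd_step(unsigned long n)
{
    if (n == 0)
        return (bounce){ NULL, 0, false };
    return (bounce){ is_even_step, n - 1, false };  /* "call" is_even */
}

/* The trampoline: repeatedly calls whatever function was returned. */
static bool trampoline(step_fn f, unsigned long n)
{
    bounce b = { f, n, false };

    while (b.next != NULL)
        b = b.next(b.arg);
    return b.result;
}

bool is_even(unsigned long n)
{
    return trampoline(is_even_step, n);
}
```

Because every step returns to the loop before the next one starts, the C stack never holds more than one step's frame, whatever the value of n.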
It is possible to implement trampolining using higher-order functions in languages that support them, such as Groovy, Visual Basic .NET and C#.
Using a trampoline for all function calls is rather more expensive than the normal C function call, so at least one Scheme compiler, Chicken, uses a technique first described by Henry Baker from an unpublished suggestion by Andrew Appel, in which normal C calls are used but the stack size is checked before every call. When the stack reaches its maximum permitted size, objects on the stack are garbage-collected using the Cheney algorithm by moving all live data into a separate heap. Following this, the stack is unwound ("popped") and the program resumes from the state saved just before the garbage collection. Baker says "Appel's method avoids making a large number of small trampoline bounces by occasionally jumping off the Empire State Building." The garbage collection ensures that mutual tail recursion can continue indefinitely. However, this approach requires that no C function call ever returns, since there is no guarantee that its caller's stack frame still exists; therefore, it involves a much more dramatic internal rewriting of the program code: continuation-passing style.
See also
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, a tail call is a subroutine
Subroutine
In computer science, a subroutine is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code....
call that happens inside another procedure and that produces a return value, which is then immediately returned
Return statement
In computer programming, a return statement causes execution to leave the current subroutine and resume at the point in the code immediately after where the subroutine was called, known as its return address. The return address is saved, usually on the process's call stack, as part of the operation...
by the calling procedure. The call site
Call site
In programming, a call site of a function is a line in the code which calls a function. A call site passes zero or more arguments to the function, and receives zero or more return values.-Example:...
is then said to be in tail position, i.e. at the end of the calling procedure. If a subroutine performs a tail call to itself, it is called tail-recursive. This is a special case of recursion
Recursion (computer science)
Recursion in computer science is a method where the solution to a problem depends on solutions to smaller instances of the same problem. The approach can be applied to many types of problems, and is one of the central ideas of computer science....
.
Tail calls are significant because they can be implemented without adding a new stack frame to the call stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
. Most of the frame of the current procedure is not needed any more, and it can be replaced by the frame of the tail call, modified as appropriate (similar to overlay
Overlay (operating system)
In operating systems, an overlay is when a process replaces itself with the code of another program. On Unix-like systems, this is accomplished with the exec system call....
for processes, but for function calls). The program can then jump to the called subroutine. Producing such code instead of a standard call sequence is called tail call elimination, or tail call optimization.
Traditionally, tail call elimination is optional. However, in functional programming languages, tail call elimination is often guaranteed by the language standard, and this guarantee allows using recursion
Recursion (computer science)
Recursion in computer science is a method where the solution to a problem depends on solutions to smaller instances of the same problem. The approach can be applied to many types of problems, and is one of the central ideas of computer science....
, in particular tail recursion, in place of loops. In such cases, it is not correct to refer to it as an optimization, even if it is customary practice.
Description
When a function is called, the computer must "remember" the place it was called from, the return addressReturn address
In postal mail, a return address is an explicit inclusion of the address of the person sending the message. It provides the recipient with a means to determine how to respond to the sender of the message if needed....
, so that it can return to that location with the result once the call is complete. Typically, this information is saved on the call stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
, a simple list of return locations in order of the times that the call locations they describe were reached. For tail calls, there is no need to remember the place we are calling from — instead, we can perform tail call elimination by leaving the stack alone (except possibly for function arguments and local variable
Local variable
In computer science, a local variable is a variable that is given local scope. Such a variable is accessible only from the function or block in which it is declared. In programming languages with only two levels of visibility, local variables are contrasted with global variables...
s), and the newly called function will return its result directly to the original caller. Note that the tail call doesn't have to appear lexically after all other statements in the source code; it is only important that its result be immediately returned, since the calling function will never get a chance to do anything after the call if the optimization is performed.
For non-recursive function calls, this is usually an optimization that saves little time and space, since there are not that many different functions available to call. When dealing with recursive or mutually recursive functions where recursion happens through tail calls, however, the stack space and the number of returns saved can grow to be very significant, since a function can call itself, directly or indirectly, many times. In fact, it often asymptotically reduces stack space requirements from linear, or O(n), to constant, or O(1). Tail call elimination is thus required by the standard definitions of some programming languages, such as Scheme, languages in the ML family, and Haskell
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...
. In the case of Scheme, the language definition formalizes the intuitive notion of tail position exactly, by specifying which syntactic forms allow having results in tail context. Implementations allowing an unlimited number of tail calls to be active at the same moment, thanks to tail call elimination, can also be called 'properly tail-recursive'.
Besides space and execution efficiency, tail call elimination is important in the functional programming
Functional programming
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
idiom known as continuation passing style (CPS), which would otherwise quickly run out of stack space.
Syntactic form
A tail call can be located just before the syntactical end of a subroutine:function foo(data)
A(data);
return B(data);
Here, both
A(data)
and B(data)
are calls, but B is the last thing the procedure executes before returning and is thus in tail position. However, not all tail calls are necessarily located at the syntactical end of a subroutine. Consider:function bar(data)
if A(data)
return B(data);
else
return C(data);
Here, both calls to
B
and C
are in tail position, even though the first one is not syntactically at the end of bar
's body.Now consider this code:
function foo1(data)
return A(data) + 1;
function foo2(data)
var ret = A(data);
return ret;
function foo3(data)
var ret = A(data);
return (ret 0) ? 1 : ret;
Here, the call to
A(data)
is in tail position in foo2
, but it is not in tail position either in foo1
or in foo3
, because control must return to the caller to allow it to inspect or modify the return value before returning it.Example programs
Take this Scheme program as an example:
This program is not written in a tail recursion style. Now take this Scheme program as an example:
The inner procedure
fact
calls itself last in the control flow. This allows an interpreter or compilerCompiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
to reorganize the execution which would ordinarily look like this:
call factorial (3)
call fact (3 1)
call fact (2 3)
call fact (1 6)
call fact (0 6)
return 6
return 6
return 6
return 6
return 6
into the more efficient
Algorithmic efficiency
In computer science, efficiency is used to describe properties of an algorithm relating to how much of various types of resources it consumes. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process, where the goal is to reduce...
variant, in terms of space:
call factorial (3)
replace arguments with (3 1), jump to "fact"
replace arguments with (2 3), jump to "fact"
replace arguments with (1 6), jump to "fact"
replace arguments with (0 6), jump to "fact"
return 6
This reorganization saves space because no state except for the calling function's address needs to be saved, either on the stack or on the heap. This also means that the programmer need not worry about running out of stack or heap space for extremely deep recursions. It is also worth noting, in typical implementations, the tail recursive variant will be substantially faster than the other variant, but only by a constant factor, albeit a large one.
Some programmers working in functional languages will rewrite recursive code to be tail-recursive so they can take advantage of this feature. This often requires addition of an "accumulator" argument (
acc
in the above example) to the function. In some cases (such as filtering lists) and in some languages, full tail recursion may require a function that was previously purely functional to be written such that it mutates references stored in other variables.An example in pseudo-C follows. Suppose we have the following functions:
Function
a
can be changed to:There are possible aliasing problems but this is the basic idea.
Tail recursion modulo cons
Tail recursion modulo
Modulo (jargon)
The word modulo is the Latin ablative of modulus which itself means "a small measure."It was introduced into mathematics in the book Disquisitiones Arithmeticae by Carl Friedrich Gauss in 1801...
cons is a generalization of tail recursion optimization introduced by David H. D. Warren
David H. D. Warren
David H. D. Warren is a computer scientist .In the 1970s and 1980s he worked primarily on logic programming and in particular the programming language Prolog. Warren wrote the first compiler for Prolog...
in the context of compilation
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
of Prolog
Prolog
Prolog is a general purpose logic programming language associated with artificial intelligence and computational linguistics.Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is declarative: the program logic is expressed in terms of...
, seen as an explicitly set-once language. As the name suggests, it applies when the only operation left to perform after a recursive call is to prepend a known value in front of a list returned from it (or to perform a constant number of simple data-constructing operations in general), which would thus be tail call save for the said cons
Cons
In computer programming, cons is a fundamental function in most dialects of the Lisp programming language. cons constructs memory objects which hold two values or pointers to values. These objects are referred to as cells, conses, non-atomic s-expressions , or pairs...
operation. But prefixing a value at the start of a list on exit from a recursive call is the same as appending this value at the end of the growing list on entry into the recursive call, thus building the list as a side effect
Side effect
In medicine, a side effect is an effect, whether therapeutic or adverse, that is secondary to the one intended; although the term is predominantly employed to describe adverse effects, it can also apply to beneficial, but unintended, consequences of the use of a drug.Occasionally, drugs are...
. The following Prolog fragment illustrates the concept:
Thus such a call is transformed into creating a new list node
Node (computer science)
A node is a record consisting of one or more fields that are links to other nodes, and a data field. The link and data fields are often implemented by pointers or references although it is also quite common for the data to be embedded directly in the node. Nodes are used to build linked, often...
, setting its
first
field, and then making a tail call which is also passed a pointer to where its result should be written (here, the node's rest
field).As another example, consider a function in C language
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
that duplicates a linked list:
In this form the function is not tail-recursive, because control returns to the caller after the recursive call duplicates the rest of input list. Even though it actually allocates the head node prior to duplicating the rest, the caller still has to plug in the result from the callee into the
next
field. So the function is almost tail-recursive. Warren's method gives the following purely tail-recursive implementation which passes the head
node to the callee to have its next
field set by it:Note how the callee now appends to the end of the list, rather than have the caller prepend to the beginning. Characteristically for this technique, a parent frame is created here in the execution call stack, which calls (non-tail-recursively) into the tail-recursive callee which could reuse its call frame if the tail-call optimization were present in C, thus defining an iterative computation.
This properly tail-recursive implementation can be converted into explicitly iterative form:
History
In a paper delivered to the ACM
Association for Computing Machinery
The Association for Computing Machinery is a learned society for computing. It was founded in 1947 as the world's first scientific and educational computing society. Its membership is more than 92,000 as of 2009...
conference in Seattle in 1977, Guy L. Steele summarized the debate over the GOTO
Goto
goto is a statement found in many computer programming languages. It is a combination of the English words go and to. It performs a one-way transfer of control to another line of code; in contrast a function call normally returns control...
and structured programming
Structured programming
Structured programming is a programming paradigm aimed on improving the clarity, quality, and development time of a computer program by making extensive use of subroutines, block structures and for and while loops - in contrast to using simple tests and jumps such as the goto statement which could...
, and observed that procedure calls in the tail position of a procedure can be best treated as a direct transfer of control to the called procedure, typically eliminating unnecessary stack manipulation operations. Since such "tail calls" are very common in Lisp, a language where procedure calls are ubiquitous, this form of optimization considerably reduces the cost of a procedure call compared to other implementations. Steele argued that poorly implemented procedure calls had led to an artificial perception that the GOTO was cheap compared to the procedure call. Steele further argued that "in general procedure calls may be usefully thought of as GOTO statements which also pass parameters, and can be uniformly coded as [machine code] JUMP instructions", with the machine code stack manipulation instructions "considered an optimization (rather than vice versa!)". Steele cited evidence that well optimized numerical algorithms in Lisp could execute faster than code produced by then-available commercial Fortran compilers because the cost of a procedure call in Lisp was much lower. In Scheme, a Lisp dialect developed by Steele with Gerald Jay Sussman
Gerald Jay Sussman
Gerald Jay Sussman is the Panasonic Professor of Electrical Engineering at the Massachusetts Institute of Technology . He received his S.B. and Ph.D. degrees in mathematics from MIT in 1968 and 1973 respectively. He has been involved in artificial intelligence research at MIT since 1964...
, tail call elimination is mandatory.
Implementation methods
Tail recursion is important to some high-level languages
High-level programming language
A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be from the specification of the program, making the process of...
, especially functional
Functional programming
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
and logic
Logic programming
Logic programming is, in its broadest sense, the use of mathematical logic for computer programming. In this view of logic programming, which can be traced at least as far back as John McCarthy's [1958] advice-taker proposal, logic is used as a purely declarative representation language, and a...
languages and members of the Lisp
Lisp programming language
Lisp is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older...
family. In these languages, tail recursion is the most commonly used way (and sometimes the only way available) of implementing iteration. The language specification of Scheme requires that tail calls are to be optimized so as not to grow the stack. Tail calls can be made explicitly in Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...
, with a variant of the "goto" statement that takes a function name:
goto &NAME;
Various implementation methods are available.
In assembler
For compilers generating assembly directly, tail call elimination is easy: it suffices to replace a call opcode with a jump one, after fixing parameters on the stack.From a compiler's perspective, the first example above is initially translated into pseudo-assembly language
Assembly language
An assembly language is a low-level programming language for computers, microprocessors, microcontrollers, and other programmable devices. It implements a symbolic representation of the machine codes and other constants needed to program a given CPU architecture...
:
foo:
call B
call A
ret
Tail call elimination replaces the last two lines with a single jump instruction:
foo:
call B
jmp A
After subroutine A completes, it returns directly to the return address of foo, omitting the unnecessary ret instruction.
Typically, the subroutines being called need to be supplied with parameters. The generated code thus needs to make sure that the call frame for A is properly set up before jumping to the tail-called subroutine. For instance, on platforms where the call stack does not just contain the return address, but also the parameters for the subroutine, the compiler may need to emit instructions to adjust the call stack. On such a platform, consider the code:
function foo(data1, data2)
B(data1)
return A(data2)
where data1 and data2 are parameters. A compiler might translate this to the following pseudo-assembly code:
foo:
mov reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
push reg ; put data1 on stack where B expects it
call B ; B uses data1
pop ; remove data1 from stack
mov reg,[sp+data2] ; fetch data2 from stack (sp) parameter into a scratch register.
push reg ; put data2 on stack where A expects it
call A ; A uses data2
pop ; remove data2 from stack.
ret
A tail call optimizer could then change the code to:
foo:
mov reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
push reg ; put data1 on stack where B expects it
call B ; B uses data1
pop ; remove data1 from stack
mov reg,[sp+data2] ; get a copy of data2 into a scratch register
mov [sp+data1],reg ; put data2 where A expects it
jmp A ; A uses data2 and returns immediately to caller.
This changed code is more efficient both in terms of execution speed and use of stack space.
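The same transformation can be pictured at source level for a self-recursive tail call: overwriting the parameters and jumping corresponds to rebinding the arguments and looping. The following is only an illustrative sketch in Python, which does not itself eliminate tail calls; the function names sum_to_recursive and sum_to_loop are hypothetical and the rewrite is done by hand rather than by a compiler.

```python
def sum_to_recursive(n, acc=0):
    # Tail-recursive form: the recursive call is the last action performed,
    # so nothing in this frame is needed once the call has been made.
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n, acc=0):
    # Hand-written equivalent of tail call elimination: the current frame
    # is reused instead of a new one being pushed.
    while True:
        if n == 0:
            return acc
        # "Fix the parameters" in place, then "jump" back to the top.
        n, acc = n - 1, acc + n


print(sum_to_recursive(10))  # 55
print(sum_to_loop(100_000))  # works at a depth where the recursive form would hit Python's default recursion limit
```

Both functions compute the same sum, but only the loop version runs in constant stack space.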
Through trampolining
Many Scheme compilers use C as an intermediate target code, so the problem comes down to coding tail recursion in C without growing the stack. Many implementations achieve this by using a device known as a trampoline, a piece of code that repeatedly calls functions. All functions are entered via the trampoline. When a function has to call another, instead of calling it directly it returns the address of the function to be called, the arguments to be used, and so on, to the trampoline. This ensures that the C stack does not grow and iteration can continue indefinitely.
It is possible to implement trampolining using higher-order functions in languages that support them, such as Groovy, Visual Basic .NET and C#.
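As a rough illustration of a trampoline built from higher-order functions, the sketch below uses Python, which, like C, does not eliminate tail calls. The names trampoline, is_even and is_odd are hypothetical. Each function returns a zero-argument closure describing the next call instead of making the call itself, and the trampoline loop invokes these closures one at a time. The sketch assumes that a final result is never itself callable.

```python
def trampoline(step):
    """Run a chain of tail calls without growing the call stack."""
    # A callable value means "here is the next call to make";
    # any non-callable value is treated as the final result.
    while callable(step):
        step = step()
    return step


def is_even(n):
    if n == 0:
        return True
    # Instead of calling is_odd directly, hand the call back to the trampoline.
    return lambda: is_odd(n - 1)


def is_odd(n):
    if n == 0:
        return False
    return lambda: is_even(n - 1)


# Far deeper than Python's default recursion limit, yet the stack stays shallow.
print(trampoline(lambda: is_even(100_000)))  # True
```

Each bounce costs a closure allocation and an extra call, which is the overhead the following paragraph refers to.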
Using a trampoline for all function calls is rather more expensive than the normal C function call, so at least one Scheme compiler, Chicken, uses a technique first described by Henry Baker from an unpublished suggestion by Andrew Appel, in which normal C calls are used but the stack size is checked before every call. When the stack reaches its maximum permitted size, objects on the stack are garbage-collected using the Cheney algorithm by moving all live data into a separate heap. Following this, the stack is unwound ("popped") and the program resumes from the state saved just before the garbage collection. Baker says "Appel's method avoids making a large number of small trampoline bounces by occasionally jumping off the Empire State Building." The garbage collection ensures that mutual tail recursion can continue indefinitely. However, this approach requires that no C function call ever returns, since there is no guarantee that its caller's stack frame still exists; therefore, it involves a much more dramatic internal rewriting of the program code: continuation-passing style.
See also
- Course-of-values recursion
- Recursion (computer science)
- Inline expansion
- Leaf subroutine