Recursion (computer science)
Encyclopedia
Recursion in computer science
is a method where the solution to a problem depends on solutions to smaller instances of the same problem. The approach can be applied to many types of problems, and is one of the central ideas of computer science.
Most computer programming languages support recursion by allowing a function to call itself within the program text. Some functional programming languages do not define any looping constructs but rely solely on recursion to repeatedly call code. Computability theory
has proven that these recursive-only languages are mathematically equivalent to the imperative languages, meaning they can solve the same kinds of problems even without the typical control structures like “while” and “for”.
s must process or generate an arbitrarily large quantity of data. Recursion is one technique for representing data whose exact size the programmer does not know: the programmer can specify this data with a self-referential definition. There are two types of self-referential definitions: inductive and coinductive
definitions.
s can be defined inductively (here, using Haskell
syntax):
The code above specifies a list of strings to be either empty, or a structure that contains a string and a list of strings. The self-reference in the definition permits the construction of lists of any (finite) number of strings.
Another example of inductive definition is the natural numbers (or non-negative integers):
Similarly recursive definitions are often used to model the structure of expressions
and statements
in programming languages. Language designers often express grammars in a syntax such as Backus-Naur form; here is such a grammar, for a simple language of arithmetic expressions with multiplication and addition:
This says that an expression is either a number, a product of two expressions, or a sum of two expressions. By recursively referring to expressions in the second and third lines, the grammar permits arbitrarily complex arithmetic expressions such as
A coinductive definition of infinite streams of strings, given informally, might look like this:
A stream of strings is an object s such that
head(s) is a string, and
tail(s) is a stream of strings.
This is very similar to an inductive definition of lists of strings; the difference is that this definition specifies how to access the contents of the data structure—namely, via the accessor functions
Corecursion
is related to coinduction, and can be used to compute particular instances of (possibly) infinite objects. As a programming technique, it is used most often in the context of lazy
programming languages, and can be preferable to recursion when the desired size or precision of a program's output is unknown. In such cases the program requires both a definition for an infinitely large (or infinitely precise) result, and a mechanism for taking a finite portion of that result. The problem of computing the first n prime numbers is one that can be solved with a corecursive program.
nalysis#Termination proof|proving termination]] of a function. All structurally-recursive functions on finite (inductively-defined) data structures can easily be shown to terminate, via structural induction
: intuitively, each recursive call receives a smaller piece of input data, until a base case is reached. Generatively-recursive functions, in contrast, do not necessarily feed smaller input to their recursive calls, so proof of their termination is not necessarily as simple, and avoiding infinite loops requires greater care.
of a natural number
:
The function can also be written as a recurrence relation
:
This evaluation of the recurrence relation demonstrates the computation that would be performed in evaluating the pseudocode above:
This factorial function can also be described without using recursion by making use of the typical looping constructs found in imperative programming languages:
The imperative code above is equivalent to this mathematical definition using an accumulator variable :
The definition above translates straightforwardly to functional programming languages such as Scheme; this is an example of iteration implemented recursively.
s:
Java language implementation:
Python language implementation:
Scheme language implementation:
JavaScript language implementation:
Common Lisp implementation:
Recurrence relation
for Fibonacci:
bn = bn-1 + bn-2
b1 = 1, b0 = 0
This Fibonacci algorithm is a particularly poor example of recursion, because each time the function is executed on a number greater than one, it makes two function calls to itself, leading to an exponential number of calls (and thus exponential time complexity
) in total. The following alternative approach uses two accumulator variables
To obtain the tenth number in the Fib. sequence, one must perform Fib(10,0,1). Where 0 is considered TwoNumbers back and 1 is considered OneNumber back. As can be seen in this approach, no trees are being created, therefore the efficiency is much greater, being a linear recursion. The recursion in condition 4, shows that OneNumber back becomes TwoNumbers back, and the new OneNumber back is calculated, simply decrementing the Times on each recursion.
Implemented in the Java or the C# programming language:
, used to compute the greatest common divisor
of two integers.
Function definition:
Recurrence relation for greatest common divisor, where expresses the remainder of :
The recursive program above is tail-recursive; it is equivalent to an iterative algorithm, and the computation shown above shows the steps of evaluation that would be performed by a language that eliminates tail calls. Below is a version of the same algorithm using explicit iteration, suitable for a language that does not eliminate tail calls. By maintaining its state entirely in the variables x and y and using a looping construct, the program avoids making recursive calls and growing the call stack.
The iterative algorithm requires a temporary variable, and even given knowledge of the Euclidean algorithm it is more difficult to understand the process by simple inspection, although the two algorithms are very similar in their steps.
Function definition:
Recurrence relation for hanoi:
Example Implementations:
Although not all recursive functions have an explicit solution, the Tower of Hanoi sequence can be reduced to an explicit formula.
Recursion is used in this algorithm because with each pass a new array is created by cutting the old one in half. The binary search procedure is then called recursively, this time on the new (and smaller) array. Typically the array's size is adjusted by manipulating a beginning and ending index. The algorithm exhibits a logarithmic order of growth because it essentially divides the problem domain in half with each pass.
Example implementation of binary search in C:
The examples in this section illustrate what is known as "structural recursion". This term refers to the fact that the recursive procedures are acting on data that is defined recursively.
Because the struct node data structure is defined recursively, procedures that operate on them can be implemented naturally as a recursive procedure. The list_print procedure defined below walks down the list until the list is empty (or NULL). For each node it prints the data element (an integer). In the C implementation, the list remains unchanged by the list_print procedure.
Operations on the tree can be implemented using recursion. Note that because there are two self-referencing pointers (left and right), tree operations may require two recursive calls:
At most two recursive calls will be made for any given call to tree_contains as defined above.
The above example illustrates an in-order traversal
of the binary tree. A Binary search tree
is a special case of the binary tree where the data elements of each node are in order.
is the only practical way to traverse and thus enumerate its contents. Traversing a filesystem is very similar to that of tree traversal
, therefore the concepts behind tree traversal are applicable to traversing a filesystem. More specifically, the code below would be an example of a preorder traversal of a filesystem.
This code blends the lines, at least somewhat, between recursion
and iteration
. It is, essentially, a recursive
implementation, which is the best way to traverse a filesystem. It is also an example of direct and indirect recursion. "rtraverse" is purely a direct example; "traverse" is the indirect, which calls "rtraverse." This example needs no "base case" scenario due to the fact that there will always be some fixed number of files and/or directories in a given filesystem.
s in use today allow the direct specification of recursive functions and procedures. When such a function is called, the program's runtime environment keeps track of the various instances of the function (often using a call stack
, although other methods may be used). Every recursive function can be transformed into an iterative function by replacing recursive calls with iterative control constructs and simulating the call stack with a stack
explicitly managed by the program.
Conversely, all iterative functions and procedures that can be evaluated by a computer (see Turing completeness
) can be expressed in terms of recursive functions; iterative control constructs such as while loop
s and do loops routinely are rewritten in recursive form in functional languages. However, in practice this rewriting depends on tail call elimination, which is not a feature of all languages. C
, Java
, and Python
are notable mainstream languages in which all function calls, including tail call
s, cause stack allocation that would not occur with the use of looping constructs; in these languages, a working iterative program rewritten in recursive form may overflow the call stack
.
is one such language.) Note the caveat below regarding the special case of tail recursion.
There are some types of problems whose solutions are inherently recursive, because of prior state they need to track. One example is tree traversal
; others include the Ackermann function
, depth-first search
, and divide-and-conquer algorithms such as Quicksort. All of these algorithms can be implemented iteratively with the help of an explicit stack
, but the programmer effort involved in managing the stack, and the complexity of the resulting program, arguably outweigh any advantages of the iterative solution.
s and hence do not build up any deferred operations. For example, the gcd function (shown again below) is tail-recursive. In contrast, the factorial function (also below) is not tail-recursive; because its recursive call is not in tail position, it builds up deferred multiplication operations that must be performed after the final recursive call completes. With a compiler
or interpreter that treats tail-recursive calls as jumps
rather than function calls, a tail-recursive function such as gcd will execute using constant space. Thus the program is essentially iterative, equivalent to using imperative language control structures like the "for" and "while" loops.
The significance of tail recursion is that when making a tail-recursive call, the caller's return position need not be saved on the call stack
; when the recursive call returns, it will branch directly on the previously saved return position. Therefore, on compilers that support tail-recursion optimization, tail recursion saves both space and time.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
is a method where the solution to a problem depends on solutions to smaller instances of the same problem. The approach can be applied to many types of problems, and is one of the central ideas of computer science.
"The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions."
Most computer programming languages support recursion by allowing a function to call itself within the program text. Some functional programming languages do not define any looping constructs but rely solely on recursion to repeatedly call code. Computability theory
Computability theory (computer science)
Computability is the ability to solve a problem in an effective manner. It is a key topic of the field of computability theory within mathematical logic and the theory of computation within computer science...
has proven that these recursive-only languages are mathematically equivalent to the imperative languages, meaning they can solve the same kinds of problems even without the typical control structures like “while” and “for”.
Recursive data types
Many computer programComputer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...
s must process or generate an arbitrarily large quantity of data. Recursion is one technique for representing data whose exact size the programmer does not know: the programmer can specify this data with a self-referential definition. There are two types of self-referential definitions: inductive and coinductive
Coinduction
In computer science, coinduction is a technique for defining and proving properties of systems of concurrent interacting objects.Coinduction is the mathematical dual to structural induction...
definitions.
Inductively-defined data
An inductively-defined recursive data definition is one that specifies how to construct instances of the data. For example, linked listLinked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...
s can be defined inductively (here, using Haskell
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...
syntax):
data ListOfStrings = EmptyList | Cons String ListOfStrings
The code above specifies a list of strings to be either empty, or a structure that contains a string and a list of strings. The self-reference in the definition permits the construction of lists of any (finite) number of strings.
Another example of inductive definition is the natural numbers (or non-negative integers):
A natural number is either 1 or n+1, where n is a natural number.
Similarly recursive definitions are often used to model the structure of expressions
Expression (programming)
An expression in a programming language is a combination of explicit values, constants, variables, operators, and functions that are interpreted according to the particular rules of precedence and of association for a particular programming language, which computes and then produces another value...
and statements
Statement (programming)
In computer programming a statement can be thought of as the smallest standalone element of an imperative programming language. A program written in such a language is formed by a sequence of one or more statements. A statement will have internal components .Many languages In computer programming...
in programming languages. Language designers often express grammars in a syntax such as Backus-Naur form; here is such a grammar, for a simple language of arithmetic expressions with multiplication and addition:
This says that an expression is either a number, a product of two expressions, or a sum of two expressions. By recursively referring to expressions in the second and third lines, the grammar permits arbitrarily complex arithmetic expressions such as
(5 * ((3 * 6) + 8))
, with more than one product or sum operation in a single expression.Coinductively-defined data and corecursion
A coinductive data definition is one that specifies the operations that may be performed on a piece of data; typically, self-referential coinductive definitions are used for data structures of infinite size.A coinductive definition of infinite streams of strings, given informally, might look like this:
A stream of strings is an object s such that
head(s) is a string, and
tail(s) is a stream of strings.
This is very similar to an inductive definition of lists of strings; the difference is that this definition specifies how to access the contents of the data structure—namely, via the accessor functions
head
and tail
-- and what those contents may be, whereas the inductive definition specifies how to create the structure and what it may be created from.Corecursion
Corecursion
In computer science, corecursion is a type of operation that is dual to recursion. Corecursion and codata allow total languages to work with infinite data structures such as streams. Corecursion is often used in conjunction with lazy evaluation. Corecursion can produce both finite and infinite data...
is related to coinduction, and can be used to compute particular instances of (possibly) infinite objects. As a programming technique, it is used most often in the context of lazy
Lazy evaluation
In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...
programming languages, and can be preferable to recursion when the desired size or precision of a program's output is unknown. In such cases the program requires both a definition for an infinitely large (or infinitely precise) result, and a mechanism for taking a finite portion of that result. The problem of computing the first n prime numbers is one that can be solved with a corecursive program.
Structural versus generative recursion
Some authors classify recursion as either "generative" or "structural". The distinction is related to where a recursive procedure gets the data that it works on, and how it processes that data:nalysis#Termination proof|proving termination]] of a function. All structurally-recursive functions on finite (inductively-defined) data structures can easily be shown to terminate, via structural induction
Structural induction
Structural induction is a proof method that is used in mathematical logic , computer science, graph theory, and some other mathematical fields. It is a generalization of mathematical induction...
: intuitively, each recursive call receives a smaller piece of input data, until a base case is reached. Generatively-recursive functions, in contrast, do not necessarily feed smaller input to their recursive calls, so proof of their termination is not necessarily as simple, and avoiding infinite loops requires greater care.
Factorial
A classic example of a recursive procedure is the function used to calculate the factorialFactorial
In mathematics, the factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n...
of a natural number
Natural number
In mathematics, the natural numbers are the ordinary whole numbers used for counting and ordering . These purposes are related to the linguistic notions of cardinal and ordinal numbers, respectively...
:
Pseudocode Pseudocode In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading... (recursive): |
---|
function factorial is: input: integer n such that n >= 0 output: [n × (n-1) × (n-2) × … × 1] 1. if n is 0, return 1 2. otherwise, return [ n × factorial(n-1) ] end factorial |
The function can also be written as a recurrence relation
Recurrence relation
In mathematics, a recurrence relation is an equation that recursively defines a sequence, once one or more initial terms are given: each further term of the sequence is defined as a function of the preceding terms....
:
This evaluation of the recurrence relation demonstrates the computation that would be performed in evaluating the pseudocode above:
Computing the recurrence relation for n = 4: |
---|
b4 = 4 * b3 = 4 * 3 * b2 = 4 * 3 * 2 * b1 = 4 * 3 * 2 * 1 * b0 = 4 * 3 * 2 * 1 * 1 = 4 * 3 * 2 * 1 = 4 * 3 * 2 = 4 * 6 = 24 |
This factorial function can also be described without using recursion by making use of the typical looping constructs found in imperative programming languages:
Pseudocode (iterative): |
---|
function factorial is: input: integer n such that n >= 0 output: [n × (n-1) × (n-2) × … × 1] 1. create new variable called running_total with a value of 1 2. begin loop 1. if n is 0, exit loop 2. set running_total to (running_total × n) 3. decrement n 4. repeat loop 3. return running_total end factorial |
The imperative code above is equivalent to this mathematical definition using an accumulator variable :
The definition above translates straightforwardly to functional programming languages such as Scheme; this is an example of iteration implemented recursively.
Fibonacci
Another well known mathematical recursive function is one that computes the Fibonacci numberFibonacci number
In mathematics, the Fibonacci numbers are the numbers in the following integer sequence:0,\;1,\;1,\;2,\;3,\;5,\;8,\;13,\;21,\;34,\;55,\;89,\;144,\; \ldots\; ....
s:
Pseudocode Pseudocode In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading... |
---|
function fib is: input: integer n such that n >= 0 1. if n is 0, return 0 2. if n is 1, return 1 3. otherwise, return [ fib(n-1) + fib(n-2) ] end fib |
Java language implementation:
Python language implementation:
Scheme language implementation:
JavaScript language implementation:
Common Lisp implementation:
Recurrence relation
Recurrence relation
In mathematics, a recurrence relation is an equation that recursively defines a sequence, once one or more initial terms are given: each further term of the sequence is defined as a function of the preceding terms....
for Fibonacci:
bn = bn-1 + bn-2
b1 = 1, b0 = 0
Computing the recurrence relation for n = 4: |
---|
b4 = b3 + b2 = b2 + b1 + b1 + b0 = b1 + b0 + 1 + 1 + 0 = 1 + 0 + 1 + 1 + 0 = 3 |
This Fibonacci algorithm is a particularly poor example of recursion, because each time the function is executed on a number greater than one, it makes two function calls to itself, leading to an exponential number of calls (and thus exponential time complexity
Time complexity
In computer science, the time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the size of the input to the problem. The time complexity of an algorithm is commonly expressed using big O notation, which suppresses multiplicative constants and...
) in total. The following alternative approach uses two accumulator variables
TwoBack
and OneBack
to "remember" the previous two Fibonacci numbers constructed, and so avoids the exponential time cost:Pseudocode |
---|
function fib is: input: integer Times such that Times >= 0, relative to TwoBack and OneBack long TwoBack such that TwoBack = fib(x) long OneBack such that OneBack = fib(x) 1. if Times is 0, return TwoBack 2. if Times is 1, return OneBack 3. if Times is 2, return TwoBack + OneBack 4. otherwise, return [ fib(Times-1, OneBack, TwoBack + OneBack) ] end fib |
To obtain the tenth number in the Fib. sequence, one must perform Fib(10,0,1). Where 0 is considered TwoNumbers back and 1 is considered OneNumber back. As can be seen in this approach, no trees are being created, therefore the efficiency is much greater, being a linear recursion. The recursion in condition 4, shows that OneNumber back becomes TwoNumbers back, and the new OneNumber back is calculated, simply decrementing the Times on each recursion.
Implemented in the Java or the C# programming language:
Greatest common divisor
Another famous recursive function is the Euclidean algorithmEuclidean algorithm
In mathematics, the Euclidean algorithm is an efficient method for computing the greatest common divisor of two integers, also known as the greatest common factor or highest common factor...
, used to compute the greatest common divisor
Greatest common divisor
In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...
of two integers.
Function definition:
Pseudocode Pseudocode In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading... (recursive): |
---|
function gcd is: input: integer x, integer y such that x >= y and y >= 0 1. if y is 0, return x 2. otherwise, return [ gcd( y, (remainder of x/y) ) ] end gcd |
Recurrence relation for greatest common divisor, where expresses the remainder of :
Computing the recurrence relation for x = 27 and y = 9: |
---|
gcd(27, 9) = gcd(9, 27 % 9) = gcd(9, 0) = 9 |
Computing the recurrence relation for x = 259 and y = 111: |
gcd(259, 111) = gcd(111, 259 % 111) = gcd(111, 37) = gcd(37, 0) = 37 |
The recursive program above is tail-recursive; it is equivalent to an iterative algorithm, and the computation shown above shows the steps of evaluation that would be performed by a language that eliminates tail calls. Below is a version of the same algorithm using explicit iteration, suitable for a language that does not eliminate tail calls. By maintaining its state entirely in the variables x and y and using a looping construct, the program avoids making recursive calls and growing the call stack.
Pseudocode (iterative): |
---|
function gcd is: input: integer x, integer y such that x >= y and y >= 0 1. create new variable called remainder 2. begin loop 1. if y is zero, exit loop 2. set remainder to the remainder of x/y 3. set x to y 4. set y to remainder 5. repeat loop 3. return x end gcd |
The iterative algorithm requires a temporary variable, and even given knowledge of the Euclidean algorithm it is more difficult to understand the process by simple inspection, although the two algorithms are very similar in their steps.
Towers of Hanoi
For a full discussion of this problem's description, history and solution see the main article or one of the many references. Simply put the problem is this: given three pegs, one with a set of N disks of increasing size, determine the minimum (optimal) number of steps it takes to move all the disks from their initial position to another peg without placing a larger disk on top of a smaller one.Function definition:
Recurrence relation for hanoi:
Computing the recurrence relation for n = 4: |
---|
hanoi(4) = 2*hanoi(3) + 1 = 2*(2*hanoi(2) + 1) + 1 = 2*(2*(2*hanoi(1) + 1) + 1) + 1 = 2*(2*(2*1 + 1) + 1) + 1 = 2*(2*(3) + 1) + 1 = 2*(7) + 1 = 15 |
Example Implementations:
Pseudocode Pseudocode In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading... (recursive): |
---|
function hanoi is: input: integer n, such that n >= 1 1. if n is 1 then return 1 2. return [2 * [call hanoi(n-1)] + 1] end hanoi |
Although not all recursive functions have an explicit solution, the Tower of Hanoi sequence can be reduced to an explicit formula.
An explicit formula for Towers of Hanoi: |
---|
h1 = 1 = 21 - 1 h2 = 3 = 22 - 1 h3 = 7 = 23 - 1 h4 = 15 = 24 - 1 h5 = 31 = 25 - 1 h6 = 63 = 26 - 1 h7 = 127 = 27 - 1 In general: hn = 2n - 1, for all n >= 1 |
Binary search
The binary search algorithm is a method of searching an ordered array for a single element by cutting the array in half with each pass. The trick is to pick a midpoint near the center of the array, compare the data at that point with the data being searched and then responding to one of three possible conditions: the data is found at the midpoint, the data at the midpoint is greater than the data being searched for, or the data at the midpoint is less than the data being searched for.Recursion is used in this algorithm because with each pass a new array is created by cutting the old one in half. The binary search procedure is then called recursively, this time on the new (and smaller) array. Typically the array's size is adjusted by manipulating a beginning and ending index. The algorithm exhibits a logarithmic order of growth because it essentially divides the problem domain in half with each pass.
Example implementation of binary search in C:
Recursive data structures (structural recursion)
An important application of recursion in computer science is in defining dynamic data structures such as Lists and Trees. Recursive data structures can dynamically grow to a theoretically infinite size in response to runtime requirements; in contrast, a static array's size requirements must be set at compile time.
"Recursive algorithms are particularly appropriate when the underlying problem or the data to be treated are defined in recursive terms."
The examples in this section illustrate what is known as "structural recursion". This term refers to the fact that the recursive procedures are acting on data that is defined recursively.
As long as a programmer derives the template from a data definition, functions employ structural recursion. That is, the recursions in a function's body consume some immediate piece of a given compound value.
Linked lists
Below is a simple definition of a linked list node. Notice especially how the node is defined in terms of itself. The "next" element of struct node is a pointer to another struct node, effectively creating a list type.Because the struct node data structure is defined recursively, procedures that operate on them can be implemented naturally as a recursive procedure. The list_print procedure defined below walks down the list until the list is empty (or NULL). For each node it prints the data element (an integer). In the C implementation, the list remains unchanged by the list_print procedure.
Binary trees
Below is a simple definition for a binary tree node. Like the node for linked lists, it is defined in terms of itself, recursively. There are two self-referential pointers: left (pointing to the left sub-tree) and right (pointing to the right sub-tree).Operations on the tree can be implemented using recursion. Note that because there are two self-referencing pointers (left and right), tree operations may require two recursive calls:
At most two recursive calls will be made for any given call to tree_contains as defined above.
The above example illustrates an in-order traversal
Tree traversal
In computer science, tree-traversal refers to the process of visiting each node in a tree data structure, exactly once, in a systematic way. Such traversals are classified by the order in which the nodes are visited...
of the binary tree. A Binary search tree
Binary search tree
In computer science, a binary search tree , which may sometimes also be called an ordered or sorted binary tree, is a node-based binary tree data structurewhich has the following properties:...
is a special case of the binary tree where the data elements of each node are in order.
Filesystem traversal
Since the number of files in a filesystem may vary, recursionRecursion
Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
is the only practical way to traverse and thus enumerate its contents. Traversing a filesystem is very similar to that of tree traversal
Tree traversal
In computer science, tree-traversal refers to the process of visiting each node in a tree data structure, exactly once, in a systematic way. Such traversals are classified by the order in which the nodes are visited...
, therefore the concepts behind tree traversal are applicable to traversing a filesystem. More specifically, the code below would be an example of a preorder traversal of a filesystem.
This code blends the lines, at least somewhat, between recursion
Recursion
Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
and iteration
Iteration
Iteration means the act of repeating a process usually with the aim of approaching a desired goal or target or result. Each repetition of the process is also called an "iteration," and the results of one iteration are used as the starting point for the next iteration.-Mathematics:Iteration in...
. It is, essentially, a recursive
Recursive
Recursive may refer to:*Recursion, the technique of functions calling themselves*Recursive function, a total computable function*Recursive language, a language which is decidable...
implementation, which is the best way to traverse a filesystem. It is also an example of direct and indirect recursion. "rtraverse" is purely a direct example; "traverse" is the indirect, which calls "rtraverse." This example needs no "base case" scenario due to the fact that there will always be some fixed number of files and/or directories in a given filesystem.
Expressive power
Most programming languageProgramming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s in use today allow the direct specification of recursive functions and procedures. When such a function is called, the program's runtime environment keeps track of the various instances of the function (often using a call stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
, although other methods may be used). Every recursive function can be transformed into an iterative function by replacing recursive calls with iterative control constructs and simulating the call stack with a stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...
explicitly managed by the program.
Conversely, all iterative functions and procedures that can be evaluated by a computer (see Turing completeness
Turing completeness
In computability theory, a system of data-manipulation rules is said to be Turing complete or computationally universal if and only if it can be used to simulate any single-taped Turing machine and thus in principle any computer. A classic example is the lambda calculus...
) can be expressed in terms of recursive functions; iterative control constructs such as while loop
While loop
In most computer programming languages, a while loop is a control flow statement that allows code to be executed repeatedly based on a given boolean condition. The while loop can be thought of as a repeating if statement....
s and do loops routinely are rewritten in recursive form in functional languages. However, in practice this rewriting depends on tail call elimination, which is not a feature of all languages. C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
, Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...
, and Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...
are notable mainstream languages in which all function calls, including tail call
Tail call
In computer science, a tail call is a subroutine call that happens inside another procedure and that produces a return value, which is then immediately returned by the calling procedure. The call site is then said to be in tail position, i.e. at the end of the calling procedure. If a subroutine...
s, cause stack allocation that would not occur with the use of looping constructs; in these languages, a working iterative program rewritten in recursive form may overflow the call stack
Stack overflow
In software, a stack overflow occurs when too much memory is used on the call stack. The call stack contains a limited amount of memory, often determined at the start of the program. The size of the call stack depends on many factors, including the programming language, machine architecture,...
.
Other considerations
In some programming languages, the stack space available to a thread is much less than the space available in the heap, and recursive algorithms tend to require more stack space than iterative algorithms. Consequently, these languages sometimes place a limit on the depth of recursion to avoid stack overflows. (PythonPython (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...
is one such language.) Note the caveat below regarding the special case of tail recursion.
There are some types of problems whose solutions are inherently recursive, because of prior state they need to track. One example is tree traversal
Tree traversal
In computer science, tree-traversal refers to the process of visiting each node in a tree data structure, exactly once, in a systematic way. Such traversals are classified by the order in which the nodes are visited...
; others include the Ackermann function
Ackermann function
In computability theory, the Ackermann function, named after Wilhelm Ackermann, is one of the simplest and earliest-discovered examples of a total computable function that is not primitive recursive...
, depth-first search
Depth-first search
Depth-first search is an algorithm for traversing or searching a tree, tree structure, or graph. One starts at the root and explores as far as possible along each branch before backtracking....
, and divide-and-conquer algorithms such as Quicksort. All of these algorithms can be implemented iteratively with the help of an explicit stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...
, but the programmer effort involved in managing the stack, and the complexity of the resulting program, arguably outweigh any advantages of the iterative solution.
Tail-recursive functions
Tail-recursive functions are functions in which all recursive calls are tail callTail call
In computer science, a tail call is a subroutine call that happens inside another procedure and that produces a return value, which is then immediately returned by the calling procedure. The call site is then said to be in tail position, i.e. at the end of the calling procedure. If a subroutine...
s and hence do not build up any deferred operations. For example, the gcd function (shown again below) is tail-recursive. In contrast, the factorial function (also below) is not tail-recursive; because its recursive call is not in tail position, it builds up deferred multiplication operations that must be performed after the final recursive call completes. With a compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
or interpreter that treats tail-recursive calls as jumps
Goto
goto is a statement found in many computer programming languages. It is a combination of the English words go and to. It performs a one-way transfer of control to another line of code; in contrast a function call normally returns control...
rather than function calls, a tail-recursive function such as gcd will execute using constant space. Thus the program is essentially iterative, equivalent to using imperative language control structures like the "for" and "while" loops.
Tail recursion: | Augmenting recursion: |
---|---|
The significance of tail recursion is that when making a tail-recursive call, the caller's return position need not be saved on the call stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
; when the recursive call returns, it will branch directly on the previously saved return position. Therefore, on compilers that support tail-recursion optimization, tail recursion saves both space and time.
Order of execution
In a recursive function, the position in which additional statements (i.e., statements other than the recursive call itself) are placed is important. In the simple case of a function calling itself only once, a statement placed before the recursive call will be executed first in the outermost stack frame, while a statement placed after the recursive call will be executed first in the innermost stack frame. Consider this example:Function 1
Function 2 with swapped lines
Direct and indirect recursion
Most of the examples presented here demonstrate direct recursion, in which a function calls itself. Indirect recursion occurs when a function is called not by itself but by another function that it called (either directly or indirectly). "Chains" of three or more functions are possible; for example, function 1 calls function 2, function 2 calls function 3, and function 3 calls function 1 again.See also
- Ackermann functionAckermann functionIn computability theory, the Ackermann function, named after Wilhelm Ackermann, is one of the simplest and earliest-discovered examples of a total computable function that is not primitive recursive...
- Anonymous recursion
- CorecursionCorecursionIn computer science, corecursion is a type of operation that is dual to recursion. Corecursion and codata allow total languages to work with infinite data structures such as streams. Corecursion is often used in conjunction with lazy evaluation. Corecursion can produce both finite and infinite data...
- Functional programmingFunctional programmingIn computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...
- Kleene–Rosser paradox
- McCarthy 91 functionMcCarthy 91 functionThe McCarthy 91 function is a recursive function, defined by computer scientist John McCarthy as a test case for formal verification within computer science.The McCarthy 91 function is defined as...
- MemoizationMemoizationIn computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...
- Mutual recursionMutual recursionMutual recursion is a form of recursion where two mathematical or computational functions are defined in terms of each other.For instance, consider two functions even? and odd? defined as follows: function even?...
- μ-recursive function
- Primitive recursive functionPrimitive recursive functionThe primitive recursive functions are defined using primitive recursion and composition as central operations and are a strict subset of the total µ-recursive functions...
- RecursionRecursionRecursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...
- Sierpiński curveSierpinski curveSierpiński curves are a recursively defined sequence of continuous closed plane fractal curves discovered by Wacław Sierpiński, which in the limit n \rightarrow \infty completely fill the unit square: thus their limit curve, also called the Sierpiński curve, is an example of a space-filling...
- Takeuchi function
External links
- Harold Abelson and Gerald Sussman: "Structure and Interpretation Of Computer Programs"
- IBM DeveloperWorks: "Mastering Recursive Programming"
- David S. Touretzky: "Common Lisp: A Gentle Introduction to Symbolic Computation"
- Matthias Felleisen: "How To Design Programs: An Introduction to Computing and Programming"
- Duke University: "Big-Oh for Recursive Functions: Recurrence Relations"