Lambda calculus - AbsoluteAstronomy.com

Mathematical logic

Mathematical logic is a subfield of mathematics with close connections to foundations of mathematics, theoretical computer science and philosophical logic. The field includes both the mathematical study of logic and the applications of formal logic to other areas of mathematics...

and computer science

Computer science

Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, lambda calculus, also written as λ-calculus, is a formal system

Formal system

In formal logic, a formal system consists of a formal language and a set of inference rules, used to derive an expression from one or more other premises that are antecedently supposed or derived . The axioms and rules may be called a deductive apparatus...

for function

Function (mathematics)

In mathematics, a function associates one quantity, the argument of the function, also known as the input, with another quantity, the value of the function, also known as the output. A function assigns exactly one output to each input. The argument and the value may be real numbers, but they can...

definition, function application and recursion

Recursion

Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion. The term has a variety of meanings specific to a variety of disciplines ranging from...

. The portion of lambda calculus relevant to computation is now called the untyped lambda calculus. In both typed lambda calculus
Typed lambda calculus
A typed lambda calculus is a typed formalism that uses the lambda-symbol to denote anonymous function abstraction. In this context, types are usually objects of a syntactic nature that are assigned to lambda terms; the exact nature of a type depends on the calculus considered...

and untyped versions, ideas from lambda calculus have found application in the fields of logic

Logic

In philosophy, Logic is the formal systematic study of the principles of valid inference and correct reasoning. Logic is used in most intellectual activities, but is studied primarily in the disciplines of philosophy, mathematics, semantics, and computer science...

, recursion theory (computability)

Computability theory

Computability theory, also called recursion theory, is a branch of mathematical logic that originated in the 1930s with the study of computable functions and Turing degrees. The field has grown to include the study of generalized computability and definability...

, and linguistics

Linguistics

Linguistics is the scientific study of human language. Linguistics can be broadly broken into three categories or subfields of study: language form, language meaning, and language in context....

, and have played an important role in the development of the theory of programming languages

Programming language theory

Programming language theory is a branch of computer science that deals with the design, implementation, analysis, characterization, and classification of programming languages and their individual features. It falls within the discipline of computer science, both depending on and affecting...

(with untyped lambda calculus being the original inspiration for functional programming

Functional programming

In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state...

, in particular Lisp, and typed lambda calculi serving as the foundation for modern type system

Type system

A type system associates a type with each computed value. By examining the flow of these values, a type system attempts to ensure or prove that no type errors can occur...

s). This article deals primarily with the untyped lambda calculus.

History

Lambda calculus was introduced by Alonzo Church

Alonzo Church

Alonzo Church was an American mathematician and logician who made major contributions to mathematical logic and the foundations of theoretical computer science. He is best known for the lambda calculus, Church–Turing thesis, Frege–Church ontology, and the Church–Rosser theorem.-Life:Alonzo Church...

in the 1930s as part of an investigation into the foundations of mathematics

Foundations of mathematics

Foundations of mathematics is a term sometimes used for certain fields of mathematics, such as mathematical logic, axiomatic set theory, proof theory, model theory, type theory and recursion theory...

. The original system was shown to be logically inconsistent

Consistency

Consistency can refer to:* Consistency , the psychological need to be consistent with prior acts and statements* "Consistency", an 1887 speech by Mark Twain...

in 1935 when Stephen Kleene and J. B. Rosser developed the Kleene–Rosser paradox.

Subsequently, in 1936 Church isolated and published just the portion relevant to computation, what is now called the untyped lambda calculus. In 1940, he also introduced a computationally weaker, but logically consistent system, known as the simply typed lambda calculus

Simply typed lambda calculus

The simply typed lambda calculus , a formof type theory, is a typed interpretation of the lambda calculus with only one type constructor: \to that builds function types. It is the canonical and simplest example of a typed lambda calculus...

Motivation

Functions

Function (mathematics)

are a fundamental concept within computer science

Computer science

and mathematics. The λ-calculus provides simple semantics for computation with functions so that properties of functional computation can be studied.

Consider the following two examples. The identity function

Identity function

In mathematics, an identity function, also called identity map or identity transformation, is a function that always returns the same value that was used as its argument...

I(x) = x

takes a single input, x, and immediately returns x (i.e. the identity does nothing with its input), whereas the function

sqsum(x, y) = x*x + y*y

takes a pair of inputs, x and y and returns the sum of their squares, x*x + y*y. Using these two examples, we can make some useful observations that motivate the major ideas in the lambda calculus.

The first observation is that functions need not be explicitly named. That is, the function

sqsum(x, y) = x*x + y*y

can be rewritten in anonymous form

Anonymous function

In programming language theory, an anonymous function is a function defined, and possibly called, without being bound to an identifier. Anonymous functions are convenient to pass as an argument to a higher-order function and are ubiquitous in languages with first-class functions such as Haskell...

(x, y) ↦ x*x + y*y

(read as “the pair of x and y is mapped to x*x + y*y”). Similarly,

I(x) = x

can be rewritten in anonymous form as x ↦ x, where the input is simply mapped to itself.

The second observation is that the specific choice of name for a function's arguments is largely irrelevant. That is,

x ↦ x and

y ↦ y

express the same function: the identity. Similarly,

(x, y) ↦ x*x + y*y and

(u, v) ↦ u*u + v*v

also express the same function.

Finally, any function that requires two inputs, for instance the before mentioned sqsum function, can be reworked into an equivalent function that accepts a single input, and as output returns another function, that in turn accepts a single input. For example,

(x, y) ↦ x*x + y*y

can be reworked into

x ↦ (y ↦ x*x + y*y)

This transformation is called currying

Currying

In mathematics and computer science, currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions each with a single argument...

, and can be generalized to functions accepting an arbitrary number of arguments.

Currying may be best grasped intuitively through the use of an example. Compare the function

(x, y) ↦ x*x + y*y

with its curried form,

x ↦ (y ↦ x*x + y*y)

Given two arguments, we have:

((x, y) ↦ x*x + y*y)(5, 2)

= 5*5 + 2*2 = 29

However, using currying, we have:

((x ↦ (y ↦ x*x + y*y))(5))(2)

= (y ↦ 5*5 + y*y)(2)

= 5*5 + 2*2 = 29

and we see the uncurried and curried forms compute the same result. Notice that x*x has become a constant.

The lambda calculus

The lambda calculus consists of lambda terms and some extra operations.

Since the names of functions are largely a convenience, the lambda calculus has no means of naming a function. Since all functions expecting more than one input can be transformed into equivalent functions accepting a single input (via Currying

Currying

), the lambda calculus has no means for creating a function that accepts more than one argument. Since the names of arguments are largely irrelevant, the native notion of equality on lambda terms is alpha-equivalence (see below), which codifies this principle.

Lambda terms

The syntax of lambda terms is particularly simple. There are three ways in which to obtain them:

a lambda term may be a variable, x;
if t is a lambda term, and x is a variable, then λx.t is a lambda term (called a lambda abstraction);
if t and s are lambda terms, then ts is a lambda term (called an application).

Nothing else is a lambda term, though bracketing may be used to disambiguate terms.

Intuitively, a lambda abstraction λx.t represents an anonymous function that takes a single input, and the λ is said to bind x in t, and an application ts represents the application of input s to some function t. In the lambda calculus, functions are taken to be first class values
First-class object
In programming language design, a first-class citizen , in the context of a particular programming language, is an entity that can be constructed at run-time, passed as a parameter, returned from a subroutine, or assigned into a variable...

, so functions may be used as the inputs to other functions, and functions may return functions as their outputs.

For example, λx.x represents the identity function, x ↦ x, and (λx.x)y represents the identity function applied to y. Further, (λx.y) represents the constant function x ↦ y, the function that always returns y, no matter the input. It should be noted that function application is left-associative, so (λx.x)y z = ((λx.x)y)z.

Lambda terms on their own aren't particularly interesting.
What makes them interesting are the various notions of equivalence and reduction that can be defined over them.

Alpha equivalence

A basic form of equivalence, definable on lambda terms, is alpha equivalence.
This states that the particular choice of bound variable, in a lambda abstraction, doesn't (usually) matter.
For instance, λx.x and λy.y are alpha-equivalent lambda terms, representing the same identity function.
Note that the terms x and y aren't alpha-equivalent, because they are not bound in a lambda abstraction.
In many presentations, it is usual to identify alpha-equivalent lambda terms.

The following definitions are necessary in order to be able to define beta reduction.

Free variables

The free variables of a term are those variables not bound by a lambda abstraction.
That is, the free variables of x are just x; the free variables of λx.t are the free variables of t, with x removed, and the free variables of ts are the union of the free variables of t and s.

For example, the lambda term representing the identity λx.x has no free variables, but the constant function λx.y has a single free variable, y.

Capture-avoiding substitutions

Using the definition of free variables, we may now define a capture-avoiding substitution.
Suppose t, s and r are lambda terms and x and y are variables.
We write t[x := r] for the substitution of r for x in t, in a capture-avoiding manner.
That is:

x[x := r] = r;
y[x := r] = y if x ≠ y;
(ts)[x := r] = (t[x := r])(s[x := r]);
(λx.t)[x := r] = λx.t;
(λy.t)[x := r] = λy.(t[x := r]) if x ≠ y and y is not in the free variables of r (sometimes said "y is fresh for r").

For example, (λx.x)[y := y] = λx.(x[y := y]) = λx.x, and ((λx.y)x)[x := y] = ((λx.y)[x := y])(x[x := y]) = (λx.y)y.

The freshness condition (requiring that y is not in the free variables of r) is crucial in order to ensure that substitution does not change the meaning of functions.
For example, suppose we define another substitution action without the freshness condition.
Then, (λx.y)[y := x] = λx.(y[y := x]) = λx.x, and the constant function λx.y turns into the identity λx.x by substitution.

If our freshness condition is not met, then we may simply alpha-rename with a suitably fresh variable.
For example, switching back to our correct notion of substitution, in (λx.y)[y := x] the lambda abstraction can be renamed with a fresh variable z, to obtain (λz.y)[y := x] = λz.(y[y := x]) = λz.x, and the meaning of the function is preserved by substitution.

Beta reduction

Beta reduction states that an application of the form (λx.t)s reduces to the term t[x := s] (we write (λx.t)s → t[x := s] as a convenient shorthand for “(λx.t)s beta reduces to t[x := s]”).
For example, for every s we have (λx.x)s → x[x := s] = s, demonstrating that λx.x really is the identity.
Similarly, (λx.y)s → y[x := s] = y, demonstrating that λx.y really is a constant function.

The lambda calculus may be seen as an idealised functional programming language, like Haskell

Haskell (programming language)

Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...

or Standard ML

Standard ML

Standard ML is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.SML is a modern descendant of the ML...

.
Under this view, beta reduction corresponds to a computational step, and in the untyped lambda calculus, as presented here, reduction need not terminate.
For instance, consider the term (λx.xx)(λx.xx).
Here, we have (λx.xx)(λx.xx) → (xx)[x := λx.xx] = (x[x := λx.xx])(x[x := λx.xx]) = (λx.xx)(λx.xx).
That is, the term reduces to itself in a single beta reduction, and therefore reduction will never terminate.

Another problem with the untyped lambda calculus is the inability to distinguish between different kinds of data.
For instance, we may want to write a function that only operates on numbers.
However, in the untyped lambda calculus, there's no way to prevent our function from being applied to truth values, or strings, for instance.

Typed lambda calculi, which will be introduced later in the article, have the property that if a term is well-typed, then it never gets "stuck" (where there is no evaluation rule for the term), and that if a term e has a particular type, and e → e', then e' has the same type.

Definition

Lambda expressions are composed of

variables v₁, v₂, ..., v_n, ...

the abstraction symbols λ and .

parentheses

The set of lambda expressions, Λ, can be defined recursively

Recursive definition

In mathematical logic and computer science, a recursive definition is used to define an object in terms of itself ....

If x is a variable, then x ∈ Λ
If x is a variable and M ∈ Λ, then (λx.M) ∈ Λ
If M, N ∈ Λ, then (M N) ∈ Λ

Instances of rule 2 are known as abstractions and instances of rule 3 are known as applications.

Notation

To keep the notation of lambda expressions uncluttered, the following conventions are usually applied.

Outermost parentheses are dropped: M N instead of (M N).
Applications are assumed to be left associative: M N P may be written instead of ((M N) P).
The body of an abstraction extends as far right as possible: λx.M N means λx.(M N) and not (λx.M) N.
A sequence of abstractions is contracted: λx.λy.λz.N is abbreviated as λxyz.N.

Free and bound variables

The abstraction operator, λ, is said to bind its variable wherever it occurs in the body of the abstraction. Variables that fall within the scope of a lambda are said to be bound. All other variables are called free. For example in the following expression y is a bound variable and x is free: λy.x x y. Also note that a variable binds to its "nearest" lambda. In the following expression one single occurrence of x is bound by the second lambda: λx.y (λx.z x)

The set of free variables of a lambda expression, M, is denoted as FV(M) and is defined by recursion on the structure of the terms, as follows:

FV(x) = {x}, where x is a variable
FV(λx.M) = FV(M) \ {x}
FV(M N) = FV(M) ∪ FV(N)

An expression that contains no free variables is said to be closed. Closed lambda expressions are also known as combinators and are equivalent to terms in combinatory logic

Combinatory logic

Combinatory logic is a notation introduced by Moses Schönfinkel and Haskell Curry to eliminate the need for variables in mathematical logic. It has more recently been used in computer science as a theoretical model of computation and also as a basis for the design of functional programming...

Reduction

The meaning of lambda expressions is defined by how expressions can be reduced.

There are three kinds of reduction:

α-conversion: changing bound variables;
β-reduction: applying functions to their arguments;
η-conversion: which captures a notion of extensionality.

We also speak of the resulting equivalences: two expressions are β-equivalent, if they can be β-converted into the same expression, and α/η-equivalence are defined similarly.

The term redex, short for reducible expression, refers to subterms that can be reduced by one of the reduction rules. For example, (λx.M) N is a beta-redex; if x is not free in M, λx.M x is an eta-redex. The expression to which a redex reduces is called its reduct; using the previous example, the reducts of these expressions are respectively M[x:=N] and M.

α-conversion

Alpha-conversion, sometimes known as alpha-renaming, allows bound variable names to be changed. For example, alpha-conversion of λx.x might yield λy.y. Terms that differ only by alpha-conversion are called α-equivalent. Frequently in uses of lambda calculus, α-equivalent terms are considered to be equivalent.

The precise rules for alpha-conversion are not completely trivial. First, when alpha-converting an abstraction, the only variable occurrences that are renamed are those that are bound to the same abstraction. For example, an alpha-conversion of λx.λx.x could result in λy.λx.x, but it could not result in λy.λx.y. The latter has a different meaning from the original.

Second, alpha-conversion is not possible if it would result in a variable getting captured by a different abstraction. For example, if we replace x with y in λx.λy.x, we get λy.λy.y, which is not at all the same.

In programming languages with static scope, alpha-conversion can be used to make name resolution

Name resolution

-In computer languages:Expressions in computer languages can contain identifiers. The semantics of such expressions depend on the entities that the identifiers refer to. The algorithm that determines what an identifier in a given context refers to is part of the language definition.The complexity...

simpler by ensuring that no variable name masks

Variable shadowing

In computer programming, variable shadowing occurs when a variable declared within a certain scope has the same name as a variable declared in an outer scope. This outer variable is said to be shadowed...

a name in a containing scope

Scope (programming)

In computer programming, scope is an enclosing context where values and expressions are associated. Various programming languages have various types of scopes. The type of scope determines what kind of entities it can contain and how it affects them—or semantics...

(see alpha renaming to make name resolution trivial).

Substitution

Substitution, written E[V := E′], is the process of replacing all free occurrences of the variable V by expression E′.
Substitution on terms of the λ-calculus is defined by recursion on the structure of terms, as follows.

x[x := N] ≡ N

y[x := N] ≡ y, if x ≠ y

(M₁ M₂)[x := N] ≡ (M₁[x := N]) (M₂[x := N])

(λx.M)[x := N] ≡ λx.(M)

(λy.M)[x := N] ≡ λy.(M[x := N]), if x ≠ y, provided y ∉ FV(N)

To substitute into a lambda abstraction, it is sometimes necessary to α-convert the expression. For example, it is not correct for (λx.y)[y := x] to result in (λx.x), because the substituted x was supposed to be free but ended up being bound. The correct substitution in this case is (λz.x), up to α-equivalence. Notice that substitution is defined uniquely up to α-equivalence.

β-reduction

Beta-reduction captures the idea of function application. Beta-reduction is defined in terms of substitution: the beta-reduction of ((λV.E) E′) is E[V := E′].

For example, assuming some encoding of 2, 7, *, we have the following β-reductions: ((λn.n*2) 7) → 7*2.

η-conversion

Eta-conversion expresses the idea of extensionality

Extensionality

In logic, extensionality, or extensional equality refers to principles that judge objects to be equal if they have the same external properties...

, which in this context is that two functions are the same if and only if

If and only if

In logic and related fields such as mathematics and philosophy, if and only if is a biconditional logical connective between statements....

they give the same result for all arguments. Eta-conversion converts between λx.f x and f whenever x does not appear free in f.

Normal forms and confluence

For the untyped lambda calculus, β-reduction as a rewriting rule is neither strongly normalising nor weakly normalising.

However, it can be shown that β-reduction is confluent

Confluence (abstract rewriting)

In computer science, confluence is a property of rewriting systems, describing that terms in this system can be rewritten in more than one way, to yield the same result. This article describes the properties in the most abstract setting of an abstract rewriting system.- Motivating example :Consider...

. (Of course, we are working up to α-conversion, i.e. we consider two normal forms to be equal, if it is possible to α-convert one into the other.)

Therefore, both strongly normalising terms and weakly normalising terms have a unique normal form. For strongly normalising terms, any reduction strategy is guaranteed to yield the normal form, whereas for weakly normalising terms, some reduction strategies may fail to find it.

Encoding datatypes

The basic lambda calculus may be used to model booleans, arithmetic

Arithmetic

Arithmetic or arithmetics is the oldest and most elementary branch of mathematics, used by almost everyone, for tasks ranging from simple day-to-day counting to advanced science and business calculations. It involves the study of quantity, especially as the result of combining numbers...

, data structures and recursion, as illustrated in the following sub-sections.

Arithmetic in lambda calculus

There are several possible ways to define the natural number

Natural number

In mathematics, the natural numbers are the ordinary whole numbers used for counting and ordering . These purposes are related to the linguistic notions of cardinal and ordinal numbers, respectively...

s in lambda calculus, but by far the most common are the Church numerals, which can be defined as follows:

0 := λf.λx.x

1 := λf.λx.f x

2 := λf.λx.f (f x)

3 := λf.λx.f (f (f x))

and so on. Or using the alternate syntax presented above in Notation:

0 := λfx.x

1 := λfx.f x

2 := λfx.f (f x)

3 := λfx.f (f (f x))

A Church numeral is a higher-order function

Higher-order function

In mathematics and computer science, higher-order functions, functional forms, or functionals are functions which do at least one of the following:*take one or more functions as an input*output a function...

—it takes a single-argument function f, and returns another single-argument function. The Church numeral n is a function that takes a function f as argument and returns the n-th composition of f, i.e. the function f composed with itself n times. This is denoted f⁽ⁿ⁾ and is in fact the n-th power of f (considered as an operator); f⁽⁰⁾ is defined to be the identity function. Such repeated compositions (of a single function f) obey the laws of exponents, which is why these numerals can be used for arithmetic. (In Church's original lambda calculus, the formal parameter of a lambda expression was required to occur at least once in the function body, which made the above definition of 0 impossible.)

We can define a successor function, which takes a number n and returns n + 1 by adding an additional application of f:

SUCC := λnfx.f (n f x)

Because the m-th composition of f composed with the n-th composition of f gives the m+n-th composition of f, addition can be defined as follows:

PLUS := λmnfx.m f (n f x)

PLUS can be thought of as a function taking two natural numbers as arguments and returning a natural number; it can be verified that

PLUS 2 3

and

5

are equivalent lambda expressions. Since adding m to a number n can be accomplished by adding 1 m times, an equivalent definition is:

PLUS := λmn.m SUCC n

Similarly, multiplication can be defined as

MULT := λmnf.m (n f)

Alternatively

MULT := λmn.m (PLUS n) 0

since multiplying m and n is the same as repeating the add n function m times and then applying it to zero.
Exponentiation has a rather simple rendering in Church numerals, namely

POW := λbe.e b

The predecessor function defined by PRED n = n - 1 for a positive integer n and PRED 0 = 0 is considerably more difficult. The formula

PRED := λnfx.n (λgh.h (g f)) (λu.x) (λu.u)

can be validated by showing inductively that if T denotes (λgh.h (g f)), then T⁽ⁿ⁾(λu.x) = (λh.h(f^(n-1)(x))) for n > 0. Two other definitions of PRED are given below, one using conditionals and the other using pairs. With the predecessor function, subtraction is straightforward. Defining

SUB := λmn.n PRED m,

SUB m n yields m - n when m > n and 0 otherwise.

Logic and predicates

By convention, the following two definitions (known as Church booleans) are used for the boolean values TRUE and FALSE:

TRUE := λxy.x

FALSE := λxy.y

(Note that FALSE is equivalent to the Church numeral zero defined above)

Then, with these two λ-terms, we can define some logic operators (these are just possible formulations; other expressions are equally correct):

AND := λpq.p q p

OR := λpq.p p q

NOT := λpab.p b a

IFTHENELSE := λpab.p a b

We are now able to compute some logic functions, for example:

AND TRUE FALSE

≡ (λpq.p q p) TRUE FALSE →_β TRUE FALSE TRUE

≡ (λxy.x) FALSE TRUE →_β FALSE

and we see that AND TRUE FALSE is equivalent to FALSE.

A predicate is a function that returns a boolean value. The most fundamental predicate is ISZERO, which returns TRUE if its argument is the Church numeral 0, and FALSE if its argument is any other Church numeral:

ISZERO := λn.n (λx.FALSE) TRUE

The following predicate tests whether the first argument is less-than-or-equal-to the second:

LEQ := λmn.ISZERO (SUB m n),

and since m = n, if LEQ m n and LEQ n m, it is straightforward to build a predicate for numerical equality.

The availability of predicates and the above definition of TRUE and FALSE make it convenient to write "if-then-else" expressions in lambda calculus. For example, the predecessor function can be defined as:

PRED := λn.n (λgk.ISZERO (g 1) k (PLUS (g k) 1)) (λv.0) 0

which can be verified by showing inductively that n (λgk.ISZERO (g 1) k (PLUS (g k) 1)) (λv.0) is the add n - 1 function for n > 0.

Pairs

A pair (2-tuple) can be defined in terms of TRUE and FALSE, by using the Church encoding for pairs. For example, PAIR encapsulates the pair (x,y), FIRST returns the first element of the pair, and SECOND returns the second.

PAIR := λxyf.f x y

FIRST := λp.p TRUE

SECOND := λp.p FALSE

NIL := λx.TRUE

NULL := λp.p (λxy.FALSE)

A linked list can be defined as either NIL for the empty list, or the PAIR of an element and a smaller list. The predicate NULL tests for the value NIL. (Alternatively, with NIL := FALSE, the construct l (λhtz.deal_with_head_h_and_tail_t) (deal_with_nil) obviates the need for an explicit NULL test).

As an example of the use of pairs, the shift-and-increment function that maps (m, n) to (n, n + 1) can be defined as

Φ := λx.PAIR (SECOND x) (SUCC (SECOND x))

which allows us to give perhaps the most transparent version of the predecessor function:

PRED := λn.FIRST (n Φ (PAIR 0 0)).

Recursion and fixed points

Recursion

is the definition of a function using the function itself; on the face of it, lambda calculus does not allow this (we can't refer to a value which is yet to be defined, inside the lambda term defining that same value, as all functions are anonymous

Anonymous function

in lambda calculus). However, this impression is misleading: in (λx.x x) y both x‍ ‍'s refer to the same lambda term, y, so it is possible for a lambda expression – here y – to be arranged to receive itself as its argument value, through self-application.

Consider for instance the factorial

Factorial

In mathematics, the factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n...

function f(n) recursively defined by

f(n) = 1, if n = 0; else n × f(n - 1).

In lambda expression which is to represent this function, a parameter will be assumed to receive the lambda expression itself as its value. This argument is typically the first. Binding it to the function yields a new function that may now recurse (referring to itself through this argument). To achieve recursion, the intended-as-self-referencing argument (called r here) must always be passed to itself within the function body, at a call point:

g := λr. λn.(1, if n = 0; else n × (r r (n-1)))

with r r x = f x = g r x to hold, so r = g and

f := g g = (λx.x x) g

The self-application achieves replication here, passing the function's lambda expression on to the next invocation as an argument value, making it available to be referenced and called there.

This solves the specific problem of the factorial function, but we'd like to have a generic solution, i.e. not forcing a specific re-write for every recursive function:

g := λr. λn.(1, if n = 0; else n × (r (n-1)))

with r x = f x = g r x to hold, so r = g r =: fix g and

f := fix g where fix g := (r where r = g r) = g (fix g)

Given a lambda term representing the body of a recursive function or loop, with the first argument representing a recursive call, the fixed-point operator will return the desired recursive function or loop. The function does not need to be explicitly passed to itself at any point, as this is done in advance, when the fix-point combinator is applied, creating a self-replicating self-referencing lambda expression.

In fact, there are many possible definitions for this operator, considered as a type of fixed-point combinators. The simplest is defined as:

Y = λg.(λx.g (x x)) (λx.g (x x))

In the lambda calculus, Y g is a fixed-point of g, as it expands to:

Y g

λh.((λx.h (x x)) (λx.h (x x))) g

(λx.g (x x)) (λx.g (x x))

g ((λx.g (x x)) (λx.g (x x)))

g (Y g).

Now, to complete our recursive call to the factorial function, we would simply call (Y g) n, where n is the number we are calculating the factorial of. Given n = 4, for example, this gives:

(Y g) 4

g (Y g) 4

(λrn.(1, if n = 0; else n × (r (n-1)))) (Y g) 4

(λn.(1, if n = 0; else n × ((Y g) (n-1)))) 4

1, if 4 = 0; else 4 × ((Y g) (4-1))

4 × (g (Y g) (4-1))

4 × ((λn.(1, if n = 0; else n × ((Y g) (n-1)))) (4-1))

4 × (1, if 3 = 0; else 3 × ((Y g) (3-1)))

4 × (3 × (g (Y g) (3-1)))

4 × (3 × ((λn.(1, if n = 0; else n × ((Y g) (n-1)))) (3-1)))

4 × (3 × (1, if 2 = 0; else 2 × ((Y g) (2-1))))

4 × (3 × (2 × (g (Y g) (2-1))))

4 × (3 × (2 × ((λn.(1, if n = 0; else n × ((Y g) (n-1)))) (2-1))))

4 × (3 × (2 × (1, if 1 = 0; else 1 × ((Y g) (1-1)))))

4 × (3 × (2 × (1 × (g (Y g) (1-1)))))

4 × (3 × (2 × (1 × ((λn.(1, if n = 0; else n × ((Y g) (n-1)))) (1-1)))))

4 × (3 × (2 × (1 × (1, if 0 = 0; else 0 × ((Y g) (0-1))))))

4 × (3 × (2 × (1 × (1))))

24

Every recursively defined function can be seen as a fixed point of some other suitable function, and therefore, using Y, every recursively defined function can be expressed as a lambda expression. In particular, we can now cleanly define the subtraction, multiplication and comparison predicate of natural numbers recursively.

Standard terms

Certain terms have commonly accepted names:

I := λx.x

K := λxy.x

S := λxyz.x z (y z)

ω := λx.x x

Ω := ω ω

Y := λg.(λx.g (x x)) (λx.g (x x))

Computable functions and lambda calculus

A function F: N → N of natural number

Natural number

s is a computable function

Computable function

Computable functions are the basic objects of study in computability theory. Computable functions are the formalized analogue of the intuitive notion of algorithm. They are used to discuss computability without referring to any concrete model of computation such as Turing machines or register...

if and only if

If and only if

In logic and related fields such as mathematics and philosophy, if and only if is a biconditional logical connective between statements....

there exists a lambda expression f such that for every pair of x, y in N, F(x)=y if and only if f x =_β y, where x and y are the Church numerals corresponding to x and y, respectively and =_β meaning equivalence with beta reduction. This is one of the many ways to define computability; see the Church-Turing thesis for a discussion of other approaches and their equivalence.

Undecidability of equivalence

There is no algorithm that takes as input two lambda expressions and outputs TRUE or FALSE depending on whether or not the two expressions are equivalent. This was historically the first problem for which undecidability could be proven. As is common for a proof of undecidability, the proof shows that no computable function

Computable function

can decide the equivalence. Church's thesis is then invoked to show that no algorithm can do so.

Church's proof first reduces the problem to determining whether a given lambda expression has a normal form. A normal form is an equivalent expression that cannot be reduced any further under the rules imposed by the form. Then he assumes that this predicate is computable, and can hence be expressed in lambda calculus. Building on earlier work by Kleene and constructing a Gödel numbering for lambda expressions, he constructs a lambda expression e that closely follows the proof of Gödel's first incompleteness theorem

Gödel's incompleteness theorems

Gödel's incompleteness theorems are two theorems of mathematical logic that establish inherent limitations of all but the most trivial axiomatic systems capable of doing arithmetic. The theorems, proven by Kurt Gödel in 1931, are important both in mathematical logic and in the philosophy of...

. If e is applied to its own Gödel number, a contradiction results.

Lambda calculus and programming languages

As pointed out by Peter Landin's 1965 paper A Correspondence between ALGOL 60 and Church's Lambda-notation, sequential procedural programming languages

Procedural programming

Procedural programming can sometimes be used as a synonym for imperative programming , but can also refer to a programming paradigm, derived from structured programming, based upon the concept of the procedure call...

can be understood in terms of the lambda calculus, which provides the basic mechanisms for procedural abstraction and procedure (subprogram) application.

Lambda calculus reifies

Reification (computer science)

Reification is the process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language. A computable/addressable object — a resource — is created in a system as a proxy for a non computable/addressable object...

"functions" and makes them first-class object

First-class object

In programming language design, a first-class citizen , in the context of a particular programming language, is an entity that can be constructed at run-time, passed as a parameter, returned from a subroutine, or assigned into a variable...

s, which raises implementation complexity when it is implemented. A particular challenge is related to the support of higher-order functions, also known as the Funarg problem

Funarg problem

In computer science, the funarg problem refers to the difficulty in implementing first-class functions in stack-based programming language implementations....

. Lambda calculus is usually implemented using a virtual machine

Virtual machine

A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

approach. The first practical implementation of lambda calculus was provided in 1963 by Peter Landin, and is known as the SECD machine

SECD machine

The SECD machine is a highly influential virtual machine and abstract machine intended as a target for functional programming language compilers. The letters stand for Stack, Environment, Code, Dump, the internal registers of the machine...

. Since then, several optimized abstract machines for lambda calculus were suggested, such as the G-machine and the categorical abstract machine

Categorical abstract machine

The categorical abstract machine is a model of computation for programs that preserves the abilities of applicative, functional, or compositional style. It is based on the techniques of applicative computing.- Overview :...

.

The most prominent counterparts to lambda calculus in programming are functional programming languages, which essentially implement the calculus augmented with some constants

Constant (programming)

In computer programming, a constant is an identifier whose associated value cannot typically be altered by the program during its execution...

and datatypes. Lisp

Lisp programming language

Lisp is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older...

uses a variant of lambda notation for defining functions, but only its purely functional subset ("Pure Lisp

Lispkit Lisp

Lispkit Lisp is a lexically scoped, purely functional subset of Lisp developed as a testbed for functional programming concepts. It was first used for early experimentation with lazy evaluation. An SECD machine-based implementation written in an ALGOL variant was published by the developer Peter...

") is really equivalent to lambda calculus.

First-class functions

Anonymous functions are sometimes called lambda expressions in programming languages. An example of the 'square' function as a lambda expression in Lisp:

(lambda (x) (* x x))

or Haskell

Haskell (programming language)

:

\x -> x*x -- where the \ denotes the greek λ

The above Lisp example is an expression that evaluates to a first-class function. The symbol lambda creates an anonymous function, given a list of parameter names, (x) — just a single argument in this case, and an expression that is evaluated as the body of the function, (* x x). The Haskell example is identical.

Numerous imperative languages

Imperative programming

In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state...

, e.g. Pascal

Pascal (programming language)

Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...

, have long supported passing subprograms as arguments to other subprograms through the mechanism of function pointers. However, possession of function pointers in and of itself does not mean that functions are first class; a datatype is first class when new instances can be created at run time. Several imperative object-oriented languages do support run time creation of functions values; these include C++

C++

C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

, Smalltalk

Smalltalk

Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

and more recently in Eiffel

Eiffel (programming language)

Eiffel is an ISO-standardized, object-oriented programming language designed by Bertrand Meyer and Eiffel Software. The design of the language is closely connected with the Eiffel programming method...

("agents") and C# ("delegates"). As an example, the Eiffel "inline agent" expression

agent (x: REAL): REAL do Result := x * x end

denotes an object corresponding to the lambda expression λx.x*x (with call by value). It can be treated like any other expression, e.g. assigned to a variable or passed around to routines. If the value of square is the above agent expression, then the result of applying square to a value a (β-reduction) is expressed as square.item ([a]), where the argument is passed as a tuple

Tuple

In mathematics and computer science, a tuple is an ordered list of elements. In set theory, an n-tuple is a sequence of n elements, where n is a positive integer. There is also one 0-tuple, an empty sequence. An n-tuple is defined inductively using the construction of an ordered pair...

.

A Python

Python (programming language)

Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

example of this uses the lambda form of functions:

func = lambda x: x ** 2

This creates a new anonymous function and names it func that can be passed to other functions, stored in variables, etc. Python can also treat any other function created with the standard def statement as first-class object

First-class object

s.

The same holds for Smalltalk expression

[ :x | x * x ]

This is first-class object (block closure), which can be stored in variables, passed as arguments, etc.

A similar C++ example (using the Boost.Lambda library):

std::for_each(c.begin, c.end, std::cout << _1 * _1 << '\n');

Here the standard library function for_each iterates over all members of container 'c', and prints the square of each element. The _1 notation is Boost.Lambda's convention (originally derived from Boost.Bind) for representing the first placeholder element (the first argument), represented as x elsewhere. An example using the C++11 anonymous function, assuming 'c' contains 'int's, would be:

std::for_each(c.begin, c.end, [](int i) {std::cout << i * i << '\n'});

In JavaScript

JavaScript

JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....

since version 1.8, the notation:

function(x) x*x;

is used.

Reduction strategies

Whether a term is normalising or not, and how much work needs to be done in normalising it if it is, depends to a large extent on the reduction strategy used. The distinction between reduction strategies relates to the distinction in functional programming languages between eager evaluation

Eager evaluation

In computer programming, eager evaluation or greedy evaluation is the evaluation strategy in most traditional programming languages. In eager evaluation an expression is evaluated as soon as it gets bound to a variable. The term is typically used to contrast lazy evaluation, where expressions are...

and lazy evaluation

Lazy evaluation

In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...

.

Full beta reductions: Any redex can be reduced at any time. This means essentially the lack of any particular reduction strategy—with regard to reducibility, "all bets are off".
Applicative order: The rightmost, innermost redex is always reduced first. Intuitively this means a function's arguments are always reduced before the function itself. Applicative order always attempts to apply functions to normal forms, even when this is not possible.

Most programming languages (including Lisp, ML and imperative languages like C

C (programming language)

C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

and Java) are described as "strict", meaning that functions applied to non-normalising arguments are non-normalising. This is done essentially using applicative order, call by value reduction (see below), but usually called "eager evaluation".

Normal order: The leftmost, outermost redex is always reduced first. That is, whenever possible the arguments are substituted into the body of an abstraction before the arguments are reduced.
Call by name: As normal order, but no reductions are performed inside abstractions. For example λx.(λx.x)x is in normal form according to this strategy, although it contains the redex (λx.x)x.
Call by value: Only the outermost redexes are reduced: a redex is reduced only when its right hand side has reduced to a value (variable or lambda abstraction).
Call by need: As normal order, but function applications that would duplicate terms instead name the argument, which is then reduced only "when it is needed". Called in practical contexts "lazy evaluation". In implementations this "name" takes the form of a pointer, with the redex represented by a thunk.

Applicative order is not a normalising strategy. The usual counterexample is as follows: define Ω = ωω where ω = λx.xx. This entire expression contains only one redex, namely the whole expression; its reduct is again Ω. Since this is the only available reduction, Ω has no normal form (under any evaluation strategy). Using applicative order, the expression KIΩ = (λxy.x) (λx.x)Ω is reduced by first reducing Ω to normal form (since it is the rightmost redex), but since Ω has no normal form, applicative order fails to find a normal form for KIΩ.

In contrast, normal order is so called because it always finds a normalising reduction, if one exists. In the above example, KIΩ reduces under normal order to I, a normal form. A drawback is that redexes in the arguments may be copied, resulting in duplicated computation (for example, (λx.xx) ((λx.x)y) reduces to ((λx.x)y) ((λx.x)y) using this strategy; now there are two redexes, so full evaluation needs two more steps, but if the argument had been reduced first, there would now be none).

The positive tradeoff of using applicative order is that it does not cause unnecessary computation, if all arguments are used, because it never substitutes arguments containing redexes and hence never needs to copy them (which would duplicate work). In the above example, in applicative order (λx.xx) ((λx.x)y) reduces first to (λx.xx)y and then to the normal order yy, taking two steps instead of three.

Most purely functional programming languages (notably Miranda and its descendents, including Haskell

Haskell (programming language)

), and the proof languages of theorem provers, use lazy evaluation

Lazy evaluation

, which is essentially the same as call by need. This is like normal order reduction, but call by need manages to avoid the duplication of work inherent in normal order reduction using sharing. In the example given above, (λx.xx) ((λx.x)y) reduces to ((λx.x)y) ((λx.x)y), which has two redexes, but in call by need they are represented using the same object rather than copied, so when one is reduced the other is too.

A note about complexity

While the idea of beta reduction seems simple enough, it is not an atomic step, in that it must have a non-trivial cost when estimating computational complexity

Computational Complexity

Computational Complexity may refer to:*Computational complexity theory*Computational Complexity...

. To be precise, one must somehow find the location of all of the occurrences of the bound variable V in the expression E, implying a time cost, or one must keep track of these locations in some way, implying a space cost. A naïve search for the locations of V in E is O(n)

Big O notation

In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...

in the length n of E. This has led to the study of systems that use explicit substitution

Explicit substitution

In computer science, lambda calculi are said to have explicit substitutions if they pay special attention to the formalization of the process of substitution. This is in contrast to the standard lambda calculus where substitutions are performed by beta reductions in an implicit manner which is not...

. Sinot's director string

Director string

In mathematics, in the area of lambda calculus and computation, directors or director strings are a mechanism for keeping track of the free variables in a term...

s offer a way of tracking the locations of free variables in expressions.

Parallelism and concurrency

The Church-Rosser property of the lambda calculus means that evaluation (β-reduction) can be carried out in any order, even in parallel. This means that various nondeterministic evaluation strategies are relevant. However, the lambda calculus does not offer any explicit constructs for parallelism

Parallel computing

Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...

. One can add constructs such as Futures

Futures and promises

In computer science, future, promise, and delay refer to constructs used for synchronization in some concurrent programming languages. They describe an object that acts as a proxy for a result that is initially not known, usually because the computation of its value has not yet completed.The term...

to the lambda calculus. Other process calculi have been developed for describing communication and concurrency.

Semantics

The fact that lambda calculus terms act as functions on other lambda calculus terms, and even on themselves, led to questions about the semantics of the lambda calculus. Could a sensible meaning be assigned to lambda calculus terms? The natural semantics was to find a set D isomorphic to the function space D → D, of functions on itself. However, no nontrivial such D can exist, by cardinality constraints because the set of all functions from D into D has greater cardinality than D.

In the 1970s, Dana Scott

Dana Scott

Dana Stewart Scott is the emeritus Hillman University Professor of Computer Science, Philosophy, and Mathematical Logic at Carnegie Mellon University; he is now retired and lives in Berkeley, California...

showed that, if only continuous functions

Scott continuity

In mathematics, given two partially ordered sets P and Q a function f : P \rightarrow Q between them is Scott-continuous if it preserves all directed suprema, i.e...

were considered, a set or domain

Domain theory

Domain theory is a branch of mathematics that studies special kinds of partially ordered sets commonly called domains. Consequently, domain theory can be considered as a branch of order theory. The field has major applications in computer science, where it is used to specify denotational...

D with the required property could be found, thus providing a model

Model theory

In mathematics, model theory is the study of mathematical structures using tools from mathematical logic....

for the lambda calculus.

This work also formed the basis for the denotational semantics

Denotational semantics

In computer science, denotational semantics is an approach to formalizing the meanings of programming languages by constructing mathematical objects which describe the meanings of expressions from the languages...

of programming languages.

External links

Achim Jung, A Short Introduction to the Lambda Calculus-(PDF
Portable Document Format
Portable Document Format is an open standard for document exchange. This file format, created by Adobe Systems in 1993, is used for representing documents in a manner independent of application software, hardware, and operating systems....

)
David C. Keenan, To Dissect a Mockingbird: A Graphical Notation for the Lambda Calculus with Animated Reduction
Raúl Rojas, A Tutorial Introduction to the Lambda Calculus-(PDF
Portable Document Format
Portable Document Format is an open standard for document exchange. This file format, created by Adobe Systems in 1993, is used for representing documents in a manner independent of application software, hardware, and operating systems....

)
Peter Selinger, Lecture Notes on the Lambda Calculus-(PDF
Portable Document Format
Portable Document Format is an open standard for document exchange. This file format, created by Adobe Systems in 1993, is used for representing documents in a manner independent of application software, hardware, and operating systems....

)
L. Allison, Some executable λ-calculus examples
Georg P. Loczewski, The Lambda Calculus and A++
Bret Victor, Alligator Eggs: A Puzzle Game Based on Lambda Calculus
Lambda Calculus on Safalra’s Website
LCI Lambda Interpreter a simple yet powerful pure calculus interpreter
Lambda Calculus links on Lambda-the-Ultimate
Mike Thyer, Lambda Animator, a graphical Java applet demonstrating alternative reduction strategies.
An Introduction to Lambda Calculus and Scheme, by Jim Larson

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.

History

Motivation

The lambda calculus

Lambda terms

Alpha equivalence

Free variables

Capture-avoiding substitutions

Beta reduction

Definition

Notation

Free and bound variables

Reduction

α-conversion

Substitution

β-reduction

η-conversion

Normal forms and confluence

Encoding datatypes

Arithmetic in lambda calculus

Logic and predicates

Pairs

Recursion and fixed points

Standard terms

Computable functions and lambda calculus

Undecidability of equivalence

Lambda calculus and programming languages

First-class functions

Reduction strategies

A note about complexity

Parallelism and concurrency

Semantics

Further reading

External links