Binary lambda calculus

Binary lambda calculus (BLC) is a technique for using the lambda calculus to study Kolmogorov complexity, by working with a standard binary encoding of lambda terms and a designated universal machine. Binary lambda calculus was introduced by John Tromp in 2008.

Background

BLC is designed to provide a very simple and elegant concrete definition of descriptional complexity (Kolmogorov complexity).

Roughly speaking, the complexity of an object is the length of its shortest description.

To make this precise, we take descriptions to be bitstrings, and identify a description
method with some computational device, or machine, that transforms descriptions into
objects. Objects are usually also just bitstrings, but can have additional structure as well,
e.g., pairs of strings.

Originally, Turing machines, the most well known formalism for computation, were used for this purpose. But they are somewhat lacking in ease of construction and composability. Another classical computational formalism, the lambda calculus, offers distinct advantages in ease of use. The lambda calculus doesn't incorporate any notion of I/O though, so one needs to be designed.

Binary I/O

Adding a readbit primitive function to lambda calculus, as Chaitin did for LISP, would destroy its referential transparency, unless one distinguishes between an I/O action and its result, as Haskell does with its monadic I/O. But that requires a type system, an unnecessary complication.

Instead, BLC requires translating bitstrings into lambda terms, to which the machine
(itself a lambda term) can be readily applied.

Bits 0 and 1 are translated into the standard lambda booleans
B0 = True and B1 = False:

True = λx λy. x
False = λx λy. y

which can be seen to directly implement the if-then-else operator.

Now consider the pairing function ⟨M, N⟩ = λz. z M N applied to two terms M and N.
Applying this to a boolean yields the desired component of choice: ⟨M, N⟩ True = M, while ⟨M, N⟩ False = N.

A string s = b0b1…bn−1 is represented by repeated pairing as ⟨Bb0, ⟨Bb1, …, ⟨Bbn−1, z⟩…⟩⟩, which is denoted as s : z.

The z appearing in the above expression requires some further explanation.
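This translation can be mirrored directly in Haskell, the language used later in this article. Here is a minimal sketch (the names true, false and pair are assumptions for illustration), using Haskell functions as lambda terms:

```haskell
-- Lambda booleans: True selects its first argument, False its second,
-- so a boolean directly implements if-then-else.
true, false :: a -> a -> a
true  x _ = x   -- B0
false _ y = y   -- B1

-- Pairing <M,N> = \z -> z M N; applying the pair to a boolean
-- selects the corresponding component.
pair :: a -> a -> (a -> a -> a) -> a
pair m n z = z m n

main :: IO ()
main = do
  print (pair 1 2 true)   -- first component
  print (pair 1 2 false)  -- second component
```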

Delimited versus undelimited

Descriptional complexity actually comes in two distinct flavors,
depending on whether the input is considered to be delimited.

Knowing the end of your input makes it easier to describe objects.
For instance, you can just copy the whole input to output.
This flavor is called plain or simple complexity.

But in a sense it is additional information. A file system for instance needs to separately
store the length of files. The C language uses the null character to denote the end of a string, but this comes at the cost of not having that character available within strings.

The other flavor is called prefix complexity, named after prefix codes,
where the machine needs to figure out, from the input read so far, whether it needs
to read more bits. We say that the input is self-delimiting.
This works better for communication channels, since one can send
multiple descriptions, one after the other, and still tell them apart.

In the I/O model of BLC, the flavor is dictated by the choice of z.
If we keep it as a free variable, and demand that it appears as part of the output,
then the machine must be working in a self-delimiting manner.
If on the other hand we use a lambda term specifically designed to be easy
to distinguish from any pairing, then the input becomes delimited.
BLC chooses False for this purpose but gives it the more descriptive alternative name of Nil.
Dealing with lists that may be Nil is straightforward: since

Nil M N = N, and
⟨x, y⟩ M N = M x y N,

one can write functions M and N to deal with the two cases, the only caveat being that
N will be passed to M as its third argument.
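The third-argument caveat can be checked with a small Haskell sketch (the names cons, nil and headOr are assumptions, not from the article):

```haskell
-- A BLC list is Nil = False, or a pair <h,t> = \z -> z h t.
-- Applied to handlers M and N: Nil M N = N, while <h,t> M N = M h t N,
-- so the nil-handler N shows up as the pair-handler M's third argument.
cons :: a -> t -> ((a -> t -> n -> r) -> n -> r)
cons h t m n = m h t n

nil :: m -> n -> n   -- identical to False
nil _ n = n

main :: IO ()
main = do
  let headOr d l = l (\h _ _ -> h) d  -- pair-handler ignores tail and N
  print (headOr (0 :: Int) (cons (1 :: Int) nil))  -- head present: 1
  print (headOr (0 :: Int) nil)                    -- empty list: default 0
```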

Universality

We can find a description method U such that for any other description method D,
there is a constant c (depending only on D) such that no object takes more than c extra
bits to describe with method U than with method D.
And BLC is designed to make these constants relatively small.
In fact the constant will be the length of a binary encoding of a D-interpreter written in BLC,
and U will be a lambda term that parses this encoding and runs this decoded interpreter on the
rest of the input. U won't even have to know whether the descriptions are delimited or not;
it works the same either way.

Having already solved the problem of translating bitstrings into lambda calculus, we now
face the opposite problem: how to encode lambda terms into bitstrings?

Lambda encoding

First we need to write our lambda terms in a particular notation using what is known as
De Bruijn indices. Our encoding is then defined recursively as follows:

⟨i⟩ = 1^i 0
⟨λ M⟩ = 00 ⟨M⟩
⟨M N⟩ = 01 ⟨M⟩ ⟨N⟩

For instance, the pairing function λx λy λz. z x y
is written λ λ λ 1 3 2 in De Bruijn format, which has encoding 00 00 00 01 01 10 1110 110.
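The encoding rules are easy to implement; here is a minimal Haskell sketch (the Term data type is an assumption, with De Bruijn indices starting at 1):

```haskell
-- De Bruijn terms: Var i (index from 1), abstraction, application.
data Term = Var Int | Lam Term | App Term Term

-- BLC encoding: index i -> 1^i 0, lambda -> 00 <M>, application -> 01 <M> <N>.
encode :: Term -> String
encode (Var i)   = replicate i '1' ++ "0"
encode (Lam m)   = "00" ++ encode m
encode (App m n) = "01" ++ encode m ++ encode n

-- The pairing function \x \y \z. z x y, i.e. \ \ \ 1 3 2:
pairTerm :: Term
pairTerm = Lam (Lam (Lam (App (App (Var 1) (Var 3)) (Var 2))))

main :: IO ()
main = putStrLn (encode pairTerm)  -- 0000000101101110110
```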

A closed lambda term is one in which all variables are bound, i.e. without any free variables.
In De Bruijn format, this means that an index i can only appear
within at least i nested lambdas.
The number of closed terms of size n bits is given by sequence A114852 of the On-Line Encyclopedia of Integer Sequences.

The shortest possible closed term is the identity function λ 1 (that is, λx. x), with encoding 0010.
In delimited mode, this machine just copies its input to its output.

So, what is this universal machine U? In binary, it reads:

01010001101000010000000110000001100001011110011111110000101

11001111111000000111100001011011011100111111110000111111110

00010111101001110100101100111111000011011000010111111110000

11111111000011100110111101110011010000110010001101000011010

A detailed analysis of machine U may be found in Tromp's paper.

Complexity, concretely

In general, we can make the complexity of an object conditional on several other objects
that are provided as additional arguments to the universal machine.
Plain (or simple) complexity KS and prefix complexity KP are defined by

KS(x) = min { l(p) : U (p : Nil) = x : Nil }
KP(x) = min { l(p) : U (p : z) = x : z }

where in the prefix case z is kept as a free variable, forcing the machine to operate in self-delimiting fashion.

Theorems, concretely

The identity program λ 1 proves that KS(x) ≤ l(x) + 4.

The program proves that

The program

proves that KP(x) ≤ l(code(x)) + O(1),
where code(x) is the Levenshtein code for x defined by

code(0) = 0
code(x+1) = 1 code(l(x)) x

(juxtaposition denoting concatenation), in which we identify numbers and bitstrings according to lexicographic order.
This code has the nice property that its length exceeds l(x) only by lower-order (iterated logarithm) terms.
Furthermore, it makes lexicographic order of delimited numbers coincide with numeric order.
Number  String  Delimited
0               0
1       0       10
2       1       110 0
3       00      110 1
4       01      1110 0 00
5       10      1110 0 01
6       11      1110 0 10
7       000     1110 0 11
8       001     1110 1 000
9       010     1110 1 001
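The recursive definition and the table can be reproduced with a short Haskell sketch (the helper names toString and code are assumptions):

```haskell
-- Identify number n with a bitstring lexicographically:
-- binary(n+1) with its leading 1 removed ("" = 0, "0" = 1, "1" = 2, ...).
toString :: Integer -> String
toString n = tail (toBin (n + 1))
  where toBin 0 = ""
        toBin m = toBin (m `div` 2) ++ show (m `mod` 2)

-- Levenshtein code: code 0 = "0";
-- code (x+1) = a 1, then the code of l(x), then the string x itself.
code :: Integer -> String
code 0 = "0"
code n = '1' : code (fromIntegral (length s)) ++ s
  where s = toString (n - 1)

main :: IO ()
main = mapM_ (putStrLn . code) [0 .. 9]
```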

Complexity of sets

The complexity of a set of natural numbers is the complexity of its characteristic sequence,
an infinite lambda term in the Infinitary Lambda Calculus.

The program

whose first 100 bits of output are


proves that (a prime)
while a simple variation proves

This is even shorter than Haskell's 23-byte-long

nubBy(((>1).).gcd)[2..]
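The point-free golf unfolds into the following sketch, which keeps a candidate only when it shares no factor greater than 1 with any number kept so far:

```haskell
import Data.List (nubBy)

-- nubBy with "sharing a common factor" as the equivalence test
-- yields exactly the primes.
primes :: [Integer]
primes = nubBy (\x y -> gcd x y > 1) [2 ..]

main :: IO ()
main = print (take 10 primes)  -- [2,3,5,7,11,13,17,19,23,29]
```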

Symmetry of information

The program

proves that KP(x, y) ≤ KP(x) + KP(y | x*) + O(1),
where x* is a shortest program for x.

This inequality is the easy half of an equality (up to a constant) known as Symmetry of information.
Proving the converse
involves simulating infinitely many programs in dovetailing
Dovetailing (computer science)
Dovetailing in algorithm design, is a technique that interleaves different computations, performing them essentially simultaneously. Algorithms that use dovetailing are sometimes referred to as dovetailers....

 fashion,
seeing which ones halt and output the pair of x (for which a shortest program is given) and any y,
and for each such program p, requesting a new codeword for y of length l(p) − KP(x).
The Kraft inequality ensures that this infinite enumeration of requests can be fulfilled,
and in fact codewords for y can be decoded on the fly, in tandem with the above enumeration.
Details of this fundamental result by Chaitin can be found in Tromp's paper.

A quine

The term concatenates two copies of its input, proving that
Applying it to its own encoding gives a 132 bit quine.

Compression

So far, we've seen surprisingly little in the way of actually compressing an object into a shorter description; in the last example, we were only breaking even.
For though, we do actually compress by bits.

What could be the shortest program that produces a larger output?
The following is a good candidate: a program of size 55 bits that uses Church numerals to output exactly 2^2^2^2 = 65536 ones.
That beats both gzip and bzip2, compressors that need 360 and 352 bits respectively, to describe the same output
(as an 8192 byte file with a single letter name).
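The arithmetic behind such a candidate can be checked with Church numerals in Haskell (a sketch; the Church newtype and helper names are assumptions). Applying numeral 2 to itself three times, left-associated, yields 2^2^2^2:

```haskell
{-# LANGUAGE RankNTypes #-}
-- A Church numeral applies a function n times.
newtype Church = Church (forall a. (a -> a) -> a -> a)

two :: Church
two = Church (\f -> f . f)

-- Application of Church numerals: n `app` m computes the numeral m^n.
app :: Church -> Church -> Church
app (Church n) (Church m) = Church (n m)

-- Convert back to an ordinary Int by counting applications of (+1).
unchurch :: Church -> Int
unchurch (Church n) = n (+ 1) 0

main :: IO ()
main = print (unchurch (((two `app` two) `app` two) `app` two))  -- 65536
```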

Halting probability

The halting probability of the prefix universal machine is defined as the probability that it will output any term that has a normal form (this includes all translated strings):

Ω = Σ 2^−l(p), summed over all programs p for which U (p : z) has a normal form.

With some effort, we can determine the first 4 bits of this particular number of wisdom,
where probability .0001₂ = 2^−4 is already contributed by programs 00100 and 00101 for terms True and False.

BLC8: byte sized I/O

While bit streams are nice in theory, they fare poorly in interfacing with the real world.
The language BLC8 is a more practical variation on BLC in which programs operate on a stream of bytes,
where each byte is represented as a delimited list of 8 bits in big-endian order.

BLC8 requires a more complicated universal machine U8. Its parser keeps track of both remaining bytes, and remaining bits in the current byte, discarding the latter when parsing is completed. Thus the size of U8, which is 355 bits as a BLC program, is 45 bytes in BLC8.
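The byte translation can be sketched in Haskell (the helper names toBits and fromBits are assumptions, not part of BLC8):

```haskell
import Data.Char (chr, ord)

-- A byte as a delimited list of 8 bits, most significant bit first,
-- as BLC8 represents each input byte.
toBits :: Char -> [Bool]
toBits c = [odd (ord c `div` 2 ^ i) | i <- [7, 6 .. 0]]

fromBits :: [Bool] -> Char
fromBits = chr . foldl (\acc b -> 2 * acc + fromEnum b) 0

main :: IO ()
main = print (map fromEnum (toBits 'A'))  -- 'A' = 65 = 01000001
```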

Brainfuck

The following BLC8 program

is an unbounded tape Brainfuck interpreter in 893 bits (under 112 bytes).
The formatting obscures the occurrence of double digit indices: a 10 on line 1, and an 11 and three 12s on line 3.

This provides a nice example of the property that universal description methods differ by at most
a constant in complexity. Writing a BLC8 interpreter in Brainfuck, which would provide a
matching upper bound in the other direction, is left as an exercise for die-hard Brainfuck programmers.

The interpreter expects its input to consist of a Brainfuck program followed by a ] followed by the input for the Brainfuck program. The interpreter only looks at bits 0,1,4 of each character to determine which of ,-.+<>][ it is, so any characters other than those 8 should be stripped from a Brainfuck program.
Consuming more input than is available results in an error (a non-list result).
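That bits 0, 1 and 4 suffice to tell the eight commands apart can be verified directly; a sketch:

```haskell
import Data.Bits (testBit)
import Data.Char (ord)
import Data.List (nub)

-- Bits 0, 1 and 4 of each command's ASCII code form a triple;
-- the eight commands yield eight pairwise distinct triples.
key :: Char -> (Bool, Bool, Bool)
key c = (testBit n 0, testBit n 1, testBit n 4) where n = ord c

main :: IO ()
main = print (length (nub (map key ",-.+<>][")))  -- 8 distinct triples
```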

This interpreter is a rough translation of the following version written in Haskell:

import System.Environment (getArgs)
import Control.Monad.State
import Control.Monad.Writer
import Text.ParserCombinators.Parsec

putc = do ( _, _,b, _) <- get; tell [b]
getc = do ( left, right,_,b:input) <- get; put ( left, right, b,input)
prev = do (l:left, right,b, input) <- get; put ( left,b:right, l,input)
next = do ( left,r:right,b, input) <- get; put (b:left, right, r,input)
decr = do ( left, right,b, input) <- get; put ( left, right,pred b,input)
incr = do ( left, right,b, input) <- get; put ( left, right,succ b,input)
loop body = do (_,_,b,_) <- get; when (b /= '\0') (body >> loop body)
parseInstr = liftM loop (between (char '[') (char ']') parseInstrs)
         <|> (char '<' >> return prev)
         <|> (char '>' >> return next)
         <|> (char '-' >> return decr)
         <|> (char '+' >> return incr)
         <|> (char '.' >> return putc)
         <|> (char ',' >> return getc)
         <|> (noneOf "]" >> return (return ()))
parseInstrs = liftM sequence_ (many parseInstr)
main = do [name] <- getArgs
          source <- readFile name
          input <- getContents
          let init = (repeat '\0', repeat '\0', '\0', input)
          putStrLn $ either show (execWriter . (`evalStateT` init)) (parse parseInstrs name source)

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 