Coupling from the past - AbsoluteAstronomy.com

Markov chain Monte Carlo

Markov chain Monte Carlo methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample of the...

(MCMC) algorithms, coupling from the past is a method for sampling

Sampling (statistics)

In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population....

from the stationary distribution

Markov chain

A Markov chain, named after Andrey Markov, is a mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process characterized as memoryless: the next state depends only on the current state and not on the...

of a Markov chain. Contrary to many MCMC algorithms, coupling from the past gives in principle a perfect sample from the stationary distribution. It was invented by James Propp and David Wilson in 1996.

The basic idea

Consider a finite state irreducible

Markov chain

aperiodic

Markov chain

with state space

and (unique) stationary distribution

(

is a probability vector). Suppose that we come up with a probability distribution

on the set of maps

with the property that for every fixed

, its image

is distributed according to the transition probability of

from state

. An example of such a probability distribution is the one where

is independent from

whenever

, but it is often worthwhile to consider other distributions. Now let

for

be independent samples from

.

Suppose that

is chosen randomly according to

and is independent from the sequence

. (We do not worry for now where this

is coming from.) Then

is also distributed according to

, because

-stationary and our assumption on the law of

. Define

Then it follows by induction that

is also distributed according to

for every

. Now here is the main point. It may happen that for some

the image of the map

is a single element of

.
In other words,

for each

. Therefore, we do not need to have access to

in order to compute

. The algorithm then involves finding some

such that

is a singleton, and outputing the element of that singleton. The design of a good distribution

for which the task of finding such an

and computing

is not too costly is not always obvious, but has been accomplished successfully in several important instances.

The monotone case

There is a special class of Markov chains in which there are particularly good choices
for

and a tool for determining if

. (Here

denotes cardinality

Cardinal number

In mathematics, cardinal numbers, or cardinals for short, are a generalization of the natural numbers used to measure the cardinality of sets. The cardinality of a finite set is a natural number – the number of elements in the set. The transfinite cardinal numbers describe the sizes of infinite...

.) Suppose that

is a partially ordered set

Partially ordered set

In mathematics, especially order theory, a partially ordered set formalizes and generalizes the intuitive concept of an ordering, sequencing, or arrangement of the elements of a set. A poset consists of a set together with a binary relation that indicates that, for certain pairs of elements in the...

with order