Moran process
A Moran process, named after Patrick Moran, is a stochastic process used in biology to describe finite populations. It can be used to model variety-increasing processes such as mutation as well as variety-reducing effects such as genetic drift and natural selection. The process can describe the probabilistic dynamics in a finite population of constant size N in which two alleles A and B are competing for dominance. The two alleles are considered to be true replicators (i.e. entities that make copies of themselves). In each time step a random individual (which is of either type A or B) is chosen for reproduction and a random individual is chosen for death, which ensures that the population size remains constant. To model selection, one type has to have a higher fitness and is thus more likely to be chosen for reproduction.
The same individual can be chosen for death and for reproduction in the same step.
Neutral drift
Neutral drift is the idea that a neutral mutation can spread throughout a population, so that eventually the original allele is lost. A neutral mutation does not bring any fitness advantage or disadvantage to its bearer. The simple case of the Moran process can describe this phenomenon.
If the number of A individuals is given by i then the Moran process is defined on the state space i = 0, ..., N. Since the number of A individuals can change at most by one at each time step, transitions exist only from state i to the states i − 1, i and i + 1. Thus the transition matrix of the stochastic process is tridiagonal in shape and the transition probabilities are
\[
\begin{aligned}
P_{0,0} &= 1 \\
P_{i,i-1} &= \frac{N-i}{N}\,\frac{i}{N} \\
P_{i,i} &= 1 - P_{i,i-1} - P_{i,i+1} \\
P_{i,i+1} &= \frac{i}{N}\,\frac{N-i}{N} \\
P_{N,N} &= 1
\end{aligned}
\]
The entry $P_{i,j}$ denotes the probability of going from state i to state j. To understand the formulas for the transition probabilities, recall the definition of the process: in every step one individual is chosen for reproduction and one is chosen for death. Once the A individuals have died out, they are never reintroduced into the population, since the process does not model mutations (A cannot be reintroduced once it has died out, and vice versa), and thus $P_{0,0} = 1$. For the same reason the number of A individuals stays at N once A has taken over the population, and thus $P_{N,N} = 1$. The states 0 and N are called absorbing while the states 1, ..., N − 1 are called transient. The intermediate transition probabilities can be explained by considering the first term to be the probability of choosing for reproduction the type whose abundance will increase by one, and the second term the probability of choosing the other type for death. If the same type is chosen for reproduction and for death, the abundance of each type does not change.
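The tridiagonal structure can be made concrete with a short Python sketch. The function name `neutral_transition_matrix` is illustrative and not part of the source. The sketch assumes the neutral case, in which an A individual is chosen for reproduction with probability i/N and for death with probability i/N, independently.

```python
from fractions import Fraction

def neutral_transition_matrix(N):
    """Return the (N+1) x (N+1) transition matrix of the neutral Moran process.

    Entry P[i][j] is the probability of moving from i to j copies of A
    in a single time step. Exact rational arithmetic avoids rounding noise.
    """
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        up = Fraction(i, N) * Fraction(N - i, N)    # A reproduces, B dies
        down = Fraction(N - i, N) * Fraction(i, N)  # B reproduces, A dies
        if i < N:
            P[i][i + 1] = up
        if i > 0:
            P[i][i - 1] = down
        P[i][i] = 1 - up - down                     # same type reproduces and dies
    return P

P = neutral_transition_matrix(4)
for row in P:
    print([str(entry) for entry in row])
```

Each row sums to one, and the first and last rows contain a single 1 on the diagonal, which is what makes the states 0 and N absorbing.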
Eventually the population will reach one of the absorbing states and then stay there forever. In the transient states, random fluctuations will occur but eventually the population of A will either go extinct or reach fixation. This is one of the most important differences from deterministic processes, which cannot model random events.
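A single trajectory can be simulated to illustrate absorption. The following is a minimal sketch for the neutral case. The helper name `simulate_neutral_moran` and the use of a seeded `random.Random` instance are choices made for this example, not part of the source.

```python
import random

def simulate_neutral_moran(N, i, rng):
    """Run one neutral Moran trajectory from i copies of A until absorption.

    Returns the absorbing state reached (0 or N) and the number of steps taken.
    """
    steps = 0
    while 0 < i < N:
        parent_is_A = rng.random() < i / N  # individual chosen for reproduction
        dead_is_A = rng.random() < i / N    # individual chosen for death (may be the same one)
        i += int(parent_is_A) - int(dead_is_A)
        steps += 1
    return i, steps

final_state, steps = simulate_neutral_moran(N=20, i=5, rng=random.Random(42))
print(final_state, steps)
```

Repeating the simulation many times gives a fraction of runs ending in state N that can be compared with the fixation probability of the neutral process.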
The expected value and the variance of the number of A individuals X(t) at time t can be computed when an initial state X(0) = i is given:
\[
\begin{aligned}
E[X(t) \mid X(0) = i] &= i \\
\operatorname{Var}(X(t) \mid X(0) = i) &= \frac{2i}{N}\left(1 - \frac{i}{N}\right) \frac{1 - \left(1 - \frac{2}{N^2}\right)^t}{\frac{2}{N^2}}
\end{aligned}
\]
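These moments can be checked numerically by propagating the exact distribution of X(t) through the neutral transition probabilities. The sketch below assumes the neutral step probabilities P(k → k ± 1) = k(N − k)/N² and compares the result with a mean of i and a variance of i(N − i)(1 − (1 − 2/N²)^t), which is the variance of the neutral process in simplified form. The function name and the parameter values are illustrative.

```python
from fractions import Fraction

def distribution_after(N, i, t):
    """Exact distribution of the number of A individuals after t neutral steps."""
    dist = [Fraction(0)] * (N + 1)
    dist[i] = Fraction(1)
    for _ in range(t):
        new = [Fraction(0)] * (N + 1)
        for k, mass in enumerate(dist):
            if mass == 0:
                continue
            move = Fraction(k * (N - k), N * N)  # P(k -> k+1) = P(k -> k-1)
            if k < N:
                new[k + 1] += mass * move
            if k > 0:
                new[k - 1] += mass * move
            new[k] += mass * (1 - 2 * move)
        dist = new
    return dist

N, i, t = 6, 2, 5
dist = distribution_after(N, i, t)
mean = sum(k * p for k, p in enumerate(dist))
var = sum(k * k * p for k, p in enumerate(dist)) - mean ** 2
closed_form = i * (N - i) * (1 - (1 - Fraction(2, N * N)) ** t)
print(mean, var, closed_form)
```

Because the arithmetic is exact, the computed variance and the closed form can be compared for equality rather than within a tolerance.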
The probability that A reaches fixation is called the fixation probability. For the simple Moran process this probability is
\[
x_i = \frac{i}{N}.
\]
Since all individuals have the same fitness, they also have the same chance of becoming the ancestor of the whole population; this probability is 1 / N and thus the sum of all i probabilities (for all A individuals) is just i / N.
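As a sanity check, one can verify that i/N is consistent with the one-step dynamics: the fixation probability from state i must equal the average of the fixation probabilities of the states reachable in one step. The sketch below assumes the neutral step probabilities i(N − i)/N² for moving up or down. The function name is illustrative.

```python
from fractions import Fraction

def satisfies_fixation_recurrence(N):
    """Check that x_i = i/N solves the one-step recurrence of the neutral process."""
    x = [Fraction(i, N) for i in range(N + 1)]
    for i in range(1, N):
        move = Fraction(i * (N - i), N * N)  # probability of i -> i+1, and of i -> i-1
        rhs = move * x[i - 1] + (1 - 2 * move) * x[i] + move * x[i + 1]
        if rhs != x[i]:
            return False
    # Boundary conditions: extinction at 0, fixation at N.
    return x[0] == 0 and x[N] == 1

print(satisfies_fixation_recurrence(10))
```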
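The mean time to absorption starting in state i is given by
\[
k_i = N \left[ \sum_{j=1}^{i} \frac{N-i}{N-j} + \sum_{j=i+1}^{N-1} \frac{i}{j} \right].
\]
For large N the approximation
\[
k_i \approx -N^2 \left[ (1 - x_i) \ln(1 - x_i) + x_i \ln(x_i) \right], \qquad x_i = \frac{i}{N},
\]
holds.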
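The exact sum and the large-N approximation can be compared numerically. The sketch below assumes the neutral process, with the exact mean absorption time N[Σ_{j=1..i} (N − i)/(N − j) + Σ_{j=i+1..N−1} i/j] and the approximation −N²[(1 − x) ln(1 − x) + x ln x] for x = i/N. The function names are illustrative.

```python
import math

def mean_absorption_time(N, i):
    """Exact mean number of steps until absorption, starting from i copies of A."""
    if i <= 0 or i >= N:
        return 0.0  # already in an absorbing state
    first = sum((N - i) / (N - j) for j in range(1, i + 1))
    second = sum(i / j for j in range(i + 1, N))
    return N * (first + second)

def mean_absorption_time_approx(N, i):
    """Large-N approximation of the mean absorption time for 0 < i < N."""
    x = i / N
    return -N * N * ((1 - x) * math.log(1 - x) + x * math.log(x))

for N in (10, 100, 1000):
    i = N // 2
    print(N, mean_absorption_time(N, i), mean_absorption_time_approx(N, i))
```

The relative difference between the two values shrinks as N grows, while for small populations the exact sum should be preferred.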
Selection
If one allele has a fitness advantage over the other allele, it will be more likely to be chosen for reproduction. This can be incorporated into the model if individuals with allele A have fitness $f_i$ and individuals with allele B have fitness $g_i$, where i is the number of individuals of type A; this describes a general birth-death process.
The transition matrix of the stochastic process is tridiagonal in shape and the transition probabilities are
\[
\begin{aligned}
P_{0,0} &= 1 \\
P_{i,i-1} &= \frac{g_i (N-i)}{f_i\, i + g_i (N-i)} \cdot \frac{i}{N} \\
P_{i,i} &= 1 - P_{i,i-1} - P_{i,i+1} \\
P_{i,i+1} &= \frac{f_i\, i}{f_i\, i + g_i (N-i)} \cdot \frac{N-i}{N} \\
P_{N,N} &= 1
\end{aligned}
\]
The entry $P_{i,j}$ denotes the probability of going from state i to state j. To understand the formulas for the transition probabilities, recall again the definition of the process and note that the fitness enters only the first term in the equations, which is concerned with reproduction. Thus the probability that an A individual is chosen for reproduction is no longer i / N but depends on the fitness of A and is thus
\[
\frac{f_i\, i}{f_i\, i + g_i (N-i)}.
\]
Also in this case, the fixation probabilities $x_i$ when starting in state i are defined by the recurrence
\[
x_i =
\begin{cases}
0 & i = 0 \\
\beta_i x_{i-1} + (1 - \alpha_i - \beta_i)\, x_i + \alpha_i x_{i+1} & 1 \le i \le N-1 \\
1 & i = N
\end{cases}
\]
with $\alpha_i = P_{i,i+1}$ and $\beta_i = P_{i,i-1}$. The closed form is given by
\[
x_i = \frac{1 + \sum_{j=1}^{i-1} \prod_{k=1}^{j} \gamma_k}{1 + \sum_{j=1}^{N-1} \prod_{k=1}^{j} \gamma_k} \qquad (1)
\]
where $\gamma_i = \beta_i / \alpha_i$ per definition, which is just $g_i / f_i$ in the general case.
This general case, where the fitness of A and B depends on the abundance of each type, is studied in evolutionary game theory.
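The closed form for the fixation probabilities can be evaluated directly once the ratio γ_k of the down-step to the up-step probability in state k is known. The sketch below assumes that this ratio equals g(k)/f(k), where f(k) and g(k) denote the fitness of A and B when there are k individuals of type A. The function name and the callable-based interface are choices made for this example.

```python
from fractions import Fraction

def fixation_probabilities(N, f, g):
    """Return [x_0, ..., x_N], the fixation probabilities of A from each state.

    f(k) and g(k) are the fitness values of A and B in state k (assumed positive).
    """
    # partial[m] = 1 + sum_{j=1}^{m} prod_{k=1}^{j} gamma_k, for m = 0, ..., N-1
    partial = [Fraction(1)]
    prod = Fraction(1)
    for k in range(1, N):
        prod *= Fraction(g(k)) / Fraction(f(k))  # gamma_k = g_k / f_k
        partial.append(partial[-1] + prod)
    total = partial[N - 1]
    return [Fraction(0)] + [partial[i - 1] / total for i in range(1, N + 1)]

# Neutral case: both types have fitness 1, so x_i should reduce to i/N.
print(fixation_probabilities(5, lambda k: 1, lambda k: 1))

# A frequency-dependent example with hypothetical fitness functions.
print(fixation_probabilities(5, lambda k: 1 + k, lambda k: 2))
```

With constant, equal fitness values the result reduces to i/N, which gives a quick consistency check against the neutral process.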
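Simpler results are obtained if a constant fitness difference r is assumed. Individuals of type A reproduce with a constant rate r and individuals with allele B reproduce with rate 1. Thus if A has a fitness advantage over B, r will be larger than one, otherwise it will be smaller than one.
Thus the transition matrix of the stochastic process is tridiagonal in shape and the transition probabilities are
\[
\begin{aligned}
P_{0,0} &= 1 \\
P_{i,i-1} &= \frac{N-i}{r\, i + N - i} \cdot \frac{i}{N} \\
P_{i,i} &= 1 - P_{i,i-1} - P_{i,i+1} \\
P_{i,i+1} &= \frac{r\, i}{r\, i + N - i} \cdot \frac{N-i}{N} \\
P_{N,N} &= 1
\end{aligned}
\]
In this case $\gamma_i = 1/r$ is a constant factor for each composition of the population and thus the fixation probability from equation (1) simplifies to
\[
x_i = \frac{1 - r^{-i}}{1 - r^{-N}} \quad \Rightarrow \quad x_1 = \rho = \frac{1 - r^{-1}}{1 - r^{-N}} \qquad (2)
\]
where the fixation probability of a single mutant A in a population of otherwise all B is often of interest and is denoted by $\rho$.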
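For constant relative fitness r the fixation probability is cheap to evaluate. The sketch below assumes the form (1 − r^(−i)) / (1 − r^(−N)) and handles r = 1 separately, since the expression is then 0/0 and the neutral value i/N applies. The function name is illustrative.

```python
def fixation_probability(N, i, r):
    """Fixation probability of A starting from i copies, with constant fitness r."""
    if r == 1:
        return i / N  # neutral limit of the formula
    return (1 - r ** (-i)) / (1 - r ** (-N))

N = 100
for r in (0.9, 1.0, 1.1):
    # Fixation probability of a single mutant in a population of N individuals.
    print(r, fixation_probability(N, 1, r))
```

Even a modest advantage raises the single-mutant fixation probability well above the neutral value 1/N, while a disadvantage of the same size makes fixation very unlikely.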
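Also in the case of selection, the expected value and the variance of the number of A individuals after one step may be computed:
\[
\begin{aligned}
E[X(t+1) \mid X(t) = i] &= \frac{p s (1-p)}{p s + 1} + i \\
\operatorname{Var}(X(t+1) \mid X(t) = i) &= p (1-p)\, \frac{(s+1) + (p s + 1)^2}{(p s + 1)^2}
\end{aligned}
\]
where p = i/N and r = 1 + s.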
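The one-step moments can be cross-checked against the transition probabilities. The sketch below assumes the constant-fitness step probabilities, with A chosen for reproduction with probability r·i / (r·i + N − i). It compares the directly computed mean and variance with the closed forms i + ps(1 − p)/(ps + 1) and p(1 − p)((s + 1) + (ps + 1)²)/(ps + 1)². Both function names are illustrative.

```python
def one_step_moments(N, i, r):
    """Mean and variance of X(t+1) given X(t) = i, from the transition probabilities."""
    total = r * i + (N - i)
    up = (r * i / total) * ((N - i) / N)   # A reproduces, B dies
    down = ((N - i) / total) * (i / N)     # B reproduces, A dies
    mean_change = up - down
    variance = up + down - mean_change ** 2
    return i + mean_change, variance

def closed_form_moments(N, i, r):
    """Same quantities written in terms of p = i/N and s = r - 1."""
    p, s = i / N, r - 1
    mean = i + p * s * (1 - p) / (p * s + 1)
    variance = p * (1 - p) * ((s + 1) + (p * s + 1) ** 2) / (p * s + 1) ** 2
    return mean, variance

print(one_step_moments(50, 10, 1.2))
print(closed_form_moments(50, 10, 1.2))
```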
Rate of evolution
In a population of all B individuals, a single mutant A will take over the whole population with the probability
\[
\rho = \frac{1 - r^{-1}}{1 - r^{-N}}. \qquad (2)
\]
If the mutation rate (to go from the B to the A allele) in the population is u, then the rate at which one member of the population mutates to A is N · u, and the rate at which the whole population goes from all B to all A is the rate at which a single mutant A arises times the probability that it will take over the population (the fixation probability):
\[
R = N \cdot u \cdot \rho.
\]
Thus if the mutation is neutral (i.e. the fixation probability is just 1/N) then the rate at which an allele arises and takes over a population is independent of the population size and is equal to the mutation rate. This important result is the basis of the neutral theory of evolution and suggests that the number of observed point mutations in the genomes of two different species would simply be given by the mutation rate multiplied by two times the time since divergence. Thus the neutral theory of evolution provides a molecular clock, given that the assumptions are fulfilled, which may not be the case in reality.
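The independence from population size in the neutral case can be illustrated numerically. The sketch below assumes the rate is the product N · u · ρ, with ρ the single-mutant fixation probability for constant fitness r. The function name and the parameter values are hypothetical.

```python
def substitution_rate(N, u, r):
    """Rate at which a population moves from all B to all A.

    N: population size, u: mutation rate from B to A, r: relative fitness of A.
    """
    if r == 1:
        rho = 1 / N                          # neutral fixation probability
    else:
        rho = (1 - 1 / r) / (1 - r ** (-N))  # single-mutant fixation probability
    return N * u * rho

u = 1e-6
for N in (10, 1000, 100000):
    print(N, substitution_rate(N, u, 1), substitution_rate(N, u, 1.01))
```

For r = 1 the rate stays at u for every N, whereas for an advantageous mutant it grows with the population size.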
Literature
- Nowak, Martin A: Evolutionary Dynamics: Exploring the Equations of Life. Belknap Press (2006) ISBN 978-0674023383
- Moran, Patrick Alfred Pierce: The Statistical Processes of Evolutionary Theory. Oxford, Clarendon Press (1962).