Weasel program
The weasel program, Dawkins' weasel, or the Dawkins weasel is a thought experiment and a variety of computer simulations illustrating it. Their aim is to demonstrate that the process that drives evolutionary systems (random variation combined with non-random cumulative selection) is different from pure chance.
The thought experiment was formulated by Richard Dawkins, who also wrote the first simulation; various other implementations of the program have been written by others.
Overview
In chapter 3 of his book The Blind Watchmaker, Dawkins gave the following introduction to the program, referencing the well-known infinite monkey theorem:
The scenario is staged to produce a string of gibberish letters, assuming that the selection of each letter in a sequence of 28 characters will be random. The number of possible combinations in this random sequence is 27^28, or about 10^40, so the probability that the monkey will produce a given sequence is extremely low. Any particular sequence of 28 characters could be selected as a "target" phrase, each as improbable as Dawkins's chosen target, "METHINKS IT IS LIKE A WEASEL".
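The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the constant names are illustrative and do not come from Dawkins's book.

```python
import math

ALPHABET_SIZE = 27  # 26 uppercase letters plus the space
TARGET = "METHINKS IT IS LIKE A WEASEL"
PHRASE_LENGTH = len(TARGET)  # 28 characters, spaces included

# Number of distinct 28-character strings over a 27-symbol alphabet.
sequences = ALPHABET_SIZE ** PHRASE_LENGTH

# Order of magnitude: log10(27^28) is slightly above 40.
magnitude = math.log10(sequences)

# Chance that a single random string equals one chosen target.
probability = 1 / sequences

print(f"27^{PHRASE_LENGTH} = {sequences}")
print(f"log10 = {magnitude:.2f}")
print(f"probability per attempt = {probability:.3e}")
```

The same probability applies to every 28-character string, which is why the choice of target phrase makes no difference to the argument.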
A computer program could be written to carry out the actions of Dawkins's hypothetical monkey, continuously generating combinations of 26 letters and spaces at high speed. Even at the rate of millions of combinations per second, it is unlikely, even given the entire lifetime of the universe to run, that the program would ever produce the phrase "METHINKS IT IS LIKE A WEASEL".
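A rough estimate illustrates the scale of the problem. The sketch below assumes one million attempts per second (one reading of "millions of combinations per second") and an age of the universe of roughly 13.8 billion years; both values are assumptions chosen for illustration.

```python
SEQUENCES = 27 ** 28                 # possible 28-character strings
TRIES_PER_SECOND = 1_000_000         # assumed rate of random attempts
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9       # approximate, assumed value

# With independent random attempts, the expected number of tries
# before hitting one specific string equals the number of strings.
expected_years = SEQUENCES / TRIES_PER_SECOND / SECONDS_PER_YEAR

# Express the waiting time in multiples of the age of the universe.
universe_lifetimes = expected_years / AGE_OF_UNIVERSE_YEARS

print(f"expected waiting time: {expected_years:.2e} years")
print(f"that is about {universe_lifetimes:.2e} times the age of the universe")
```

Raising the assumed rate by several orders of magnitude would not change the conclusion, since the waiting time exceeds the age of the universe by a very large factor.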
Dawkins intends this example to illustrate a common misunderstanding of evolutionary change, i.e. that DNA sequences or organic compounds such as proteins are the result of atoms randomly combining to form more complex structures. In these types of computations, any sequence of amino acids in a protein will be extraordinarily improbable (this is known as Hoyle's fallacy). Rather, evolution proceeds by hill climbing, as in adaptive landscapes.
Dawkins then goes on to show that a process of cumulative selection can take far fewer steps to reach any given target. In Dawkins's words:
By repeating the procedure, a randomly generated sequence of 28 letters and spaces will be gradually changed each generation. The sequences progress through each generation:
- Generation 01: WDLTMNLT DTJBKWIRZREZLMQCO P
- Generation 02: WDLTMNLT DTJBSWIRZREZLMQCO P
- Generation 10: MDLDMNLS ITJISWHRZREZ MECS P
- Generation 20: MELDINLS IT ISWPRKE Z WECSEL
- Generation 30: METHINGS IT ISWLIKE B WECSEL
- Generation 40: METHINKS IT IS LIKE I WEASEL
- Generation 43: METHINKS IT IS LIKE A WEASEL
Dawkins continues:
Implications for biology
The program aims to demonstrate that the preservation of small changes in an evolving string of characters (or genes) can produce meaningful combinations in a relatively short time as long as there is some mechanism to select cumulative changes, whether it is a person identifying which traits are desirable (in the case of artificial selection) or a criterion of survival ("fitness") imposed by the environment (in the case of natural selection). Reproducing systems tend to preserve traits across generations, because the offspring inherit a copy of the parent's traits. It is the differences between offspring, the variations in copying, which become the basis for selection, allowing phrases closer to the target to survive, and the remaining variants to "die."
Dawkins discusses the issue of the mechanism of selection with respect to his "biomorphs" program:
Regarding the example's applicability to biological evolution, he is careful to point out that it has its limitations:
Criticism
Dawkins's "weasel program" has been the subject of much debate. Intelligent design proponent William A. Dembski has criticized its assumption that the intermediate stages of such a progression will be selected by evolutionary principles, and asserts that many genes that are useful in tandem would not have arisen independently.
It is often suggested that the program works by "locking" a correct letter when it is found. Robert C. Newman, for example, misunderstands the basic algorithm:
(Mere Creation, p 437)
This misunderstanding has been frequently repeated in the creationist and ID community. Creation Ministries claims that "Once a letter falls into place, Dawkin's program ensures it won't mutate away". This is not strictly correct, as the 25th iteration of the sample run to the right shows. Even so, the program conserves overall similarity to a target, a feature that Dawkins himself acknowledges is foreign to the evolutionary process, and this seems to be a valid caution against accepting the model as a proof rather than as an interesting demonstration of the way characters could be preserved from generation to generation given an appropriate selection mechanism.
Dawkins broached several of these issues himself in The Blind Watchmaker, and has also responded to these criticisms by pointing out that the program was never intended to model biological evolution accurately, and that he very specifically described it as an artificial selection process from the outset, as the citation above shows. It was only meant to demonstrate the power of cumulative selection as compared to random selection, and show the complete unrealism of the popular notion of natural selection as "monkeys pounding on typewriters". These cautions need to be borne in mind as a qualification of Dawkins' enthusiastic rhetorical use of the model in The Blind Watchmaker.
More complex models
In The Blind Watchmaker, Dawkins goes on to provide a graphical model of gene selection involving entities he calls biomorphs. These are two-dimensional sets of line segments which bear relationships to each other, drawn under the control of "genes" that determine the appearance of the biomorph. By selecting entities from sequential generations of biomorphs, an experimenter can guide the evolution of the figures toward given shapes, such as "airplane" or "octopus" biomorphs.
As a simulation, the biomorphs are not much closer to the actual genetic behavior of biological organisms. Like the Weasel program, their development is shaped by an external factor, in this case the decisions of the experimenter who chooses which of many possible shapes will go forward into the following generation. They do however serve to illustrate the concept of "genetic space," where each possible gene is treated as a dimension, and the actual genomes of living organisms make up a tiny fraction of all possible gene combinations, most of which will not produce a viable organism. As Dawkins puts it, "however many ways there may be of being alive, it is certain that there are vastly more ways of being dead".
In Climbing Mount Improbable, Dawkins responded to the limitations of the Weasel program by describing programs, written by other parties, that modeled the evolution of the spider web. He suggested that these programs were more realistic models of the evolutionary process, since they had no predetermined goal other than coming up with a web that caught more flies through a "trial and error" process. Spiderwebs were seen as good topics for evolutionary modeling because they were simple examples of biosystems that were easily visualized; the modeling programs successfully generated a range of spider webs similar to those found in nature.
Example algorithm
Although Dawkins did not provide the source code for his program, a "Weasel" style algorithm could run as follows.
1. Start with a random string of 28 characters.
2. Make 100 copies of this string, with a 5% chance per character of that character being replaced with a random character.
3. Compare each new string with the target "METHINKS IT IS LIKE A WEASEL", and give each a score (the number of letters in the string that are correct and in the correct position).
4. If any of the new strings has a perfect score (28), halt.
5. Otherwise, take the highest scoring string, and go to step 2.
For these purposes, a "character" is any uppercase letter, or a space. The number of copies per generation and the chance of mutation per letter are not specified in Dawkins's book; 100 copies and a 5% mutation rate are examples. Correct letters are not "locked": each correct letter may become incorrect in subsequent generations. The terms of the program and the existence of the target phrase do, however, mean that such "negative mutations" will quickly be "corrected".
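The steps above can be sketched in Python as follows. This is one possible reading of the example algorithm, not Dawkins's original program, whose source code was not published. The function names, the `seed` parameter and the `max_generations` safety limit are illustrative additions, and the 100 copies and 5% mutation rate are the example values given above.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # 26 letters plus the space


def score(candidate, target=TARGET):
    """Count characters that are correct and in the correct position."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy the parent; each character is replaced with a random
    character with probability `rate`. No position is ever locked."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch
        for ch in parent
    )


def weasel(copies=100, rate=0.05, seed=None, max_generations=100_000):
    """Run the example algorithm; return the best string of each generation."""
    rng = random.Random(seed)
    # Step 1: start with a random string of 28 characters.
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    history = [parent]
    # Step 4: halt on a perfect score (or at the assumed safety limit).
    while score(parent) < len(TARGET) and len(history) <= max_generations:
        # Step 2: make mutated copies of the current string.
        offspring = [mutate(parent, rate, rng) for _ in range(copies)]
        # Steps 3 and 5: score the copies and keep the highest scoring one.
        parent = max(offspring, key=score)
        history.append(parent)
    return history


if __name__ == "__main__":
    run = weasel(seed=1)
    for generation, text in enumerate(run):
        if generation % 10 == 0 or text == TARGET:
            print(f"Generation {generation:3d}: {text}  (score {score(text)})")
```

Because the parent itself is not carried over into the next generation, a correct letter can be lost whenever every copy happens to mutate it or the best copy trades it for a gain elsewhere; with these example settings such losses are usually reversed within a few generations.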