Backward induction
Encyclopedia
Backward induction is the process of reasoning backwards in time, from the end of a problem or situation, to determine a sequence of optimal actions. It proceeds by first considering the last time a decision might be made and choosing what to do in any situation at that time. Using this information, one can then determine what to do at the second-to-last time of decision. This process continues backwards until one has determined the best action for every possible situation (i.e. for every possible information set
Information set
In game theory, an information set is a set that, for a particular player, establishes all the possible moves that could have taken place in the game so far, given what that player has observed. If the game has perfect information, every information set contains only one member, namely the point...

) at every point in time.

In the mathematical optimization
Optimization (mathematics)
In mathematics, computational science, or management science, mathematical optimization refers to the selection of a best element from some set of available alternatives....

 method of dynamic programming
Dynamic programming
In mathematics and computer science, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems which are only slightly smaller and optimal substructure...

, backward induction is one of the main methods for solving the Bellman equation
Bellman equation
A Bellman equation , named after its discoverer, Richard Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming...

. In game theory
Game theory
Game theory is a mathematical method for analyzing calculated circumstances, such as in games, where a person’s success is based upon the choices of others...

, backward induction is a method used to compute subgame perfect equilibria in sequential game
Sequential game
In game theory, a sequential game is a game where one player chooses his action before the others choose theirs. Importantly, the later players must have some information of the first's choice, otherwise the difference in time would have no strategic effect...

s. The only difference is that optimization involves just one decision maker
Decision theory
Decision theory in economics, psychology, philosophy, mathematics, and statistics is concerned with identifying the values, uncertainties and other issues relevant in a given decision, its rationality, and the resulting optimal decision...

, who chooses what do at each point of time, whereas game theory analyzes how the decisions of several player
Player (game)
A player of a game is a participant therein. The term 'player' is used with this same meaning both in game theory and in ordinary recreational games....

s interact. That is, by anticipating what the last player will do in each situation, it is possible to determine what the second-to-last player will do, and so on. In the related fields of automated planning and scheduling
Automated planning and scheduling
Automated planning and scheduling is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are...

 and automated theorem proving
Automated theorem proving
Automated theorem proving or automated deduction, currently the most well-developed subfield of automated reasoning , is the proving of mathematical theorems by a computer program.- Decidability of the problem :...

, the method is called backward search or backward chaining
Backward chaining
Backward chaining is an inference method that can be described as working backward from the goal...

. In chess it is called retrograde analysis
Retrograde analysis
In chess, retrograde analysis is a computational method used to solve game positions for optimal play by working backward from known outcomes , such as the construction of endgame tablebases. In game theory at large, this method is called backward induction...

.

Backward induction has been used to solve games as long as the field of game theory has existed. John von Neumann
John von Neumann
John von Neumann was a Hungarian-American mathematician and polymath who made major contributions to a vast number of fields, including set theory, functional analysis, quantum mechanics, ergodic theory, geometry, fluid dynamics, economics and game theory, computer science, numerical analysis,...

 and Oskar Morgenstern
Oskar Morgenstern
Oskar Morgenstern was a German-born Austrian-School economist. He, along with John von Neumann, helped found the mathematical field of game theory ....

 suggested solving zero-sum
Zero-sum
In game theory and economic theory, a zero-sum game is a mathematical representation of a situation in which a participant's gain of utility is exactly balanced by the losses of the utility of other participant. If the total gains of the participants are added up, and the total losses are...

, two-person games by backward induction in their Theory of Games and Economic Behavior (1944), the book which established game theory as a field of study.

An example of decision-making by backward induction

Consider an unemployed person who will be able to work for ten more years t = 1,2,...,10. Suppose that each year in which she remains unemployed, she may be offered a 'good' job that pays $100, or a 'bad' job that pays $44, with equal probability (50/50). Once she accepts a job, she will remain in that job for the rest of the ten years. (Assume for simplicity that she cares only about her monetary earnings, and that she values earnings at different times equally, i.e., the discount rate
Time preference
In economics, time preference pertains to how large a premium a consumer places on enjoyment nearer in time over more remote enjoyment....

 is zero.)

Should this person accept bad jobs? To answer this question, we can reason backwards from time t = 10.
  • At time 10, the value of accepting a good job is $100; the value of accepting a bad job is $44; the value of rejecting the job that is available is zero. Therefore, if she is still unemployed in the last period, she should accept whatever job she is offered at that time.
  • At time 9, the value of accepting a good job is $200 (because that job will last for two years); the value of accepting a bad job is 2*$44 = $88. The value of rejecting a job offer is $0 now, plus the value of waiting for the next job offer, which will either be $44 with 50% probability or $100 with 50% probability, for an average ('expected') value of 0.5*($100+$44) = $72. Therefore regardless of whether the job available at time 9 is good or bad, it is better to accept that offer than wait for a better one.
  • At time 8, the value of accepting a good job is $300 (it will last for three years); the value of accepting a bad job is 3*$44 = $132. The value of rejecting a job offer is $0 now, plus the value of waiting for a job offer at time 9. Since we have already concluded that offers at time 9 should be accepted, the expected value of waiting for a job offer at time 9 is 0.5*($200+$88) = $144. Therefore at time 8, it is more valuable to wait for the next offer than to accept a bad job.


It can be verified by continuing to work backwards that bad offers should only be accepted if one is still unemployed at times 9 or 10; they should be rejected at all times up to t = 8. The intuition is that if one expects to work in a job for a long time, this makes it more valuable to be picky about what job to accept.

A dynamic optimization problem of this kind is called an optimal stopping
Optimal stopping
In mathematics, the theory of optimal stopping is concerned with the problem of choosing a time to take a particular action, in order to maximise an expected reward or minimise an expected cost. Optimal stopping problems can be found in areas of statistics, economics, and mathematical finance...

 problem, because the issue at hand is when to stop waiting for a better offer. Search theory
Search theory
In microeconomics, search theory studies buyers or sellers who cannot instantly find a trading partner, and must therefore search for a partner prior to transacting....

 is the field of microeconomics that applies problems of this type to contexts like shopping, job search, and marriage.

An example of backward induction in game theory

Consider the ultimatum game
Ultimatum game
The ultimatum game is a game often played in economic experiments in which two players interact to decide how to divide a sum of money that is given to them. The first player proposes how to divide the sum between the two players, and the second player can either accept or reject this proposal. ...

, where one player proposes to split a dollar with another. The first player (the proposer) suggests a division of the dollar between the two players. The second player is then given the option to either accept the split or reject it. If the second player accepts, both get the amount suggested by the proposer. If rejected, neither receives anything.

Consider the actions of the second player given any arbitrary proposal by the first player (that gives the second player more than zero). Since the only choice the second player has at each of these points in the game is to choose between something and nothing, one can expect that the second will accept. Given that the second will accept all proposals offered by the first (that give the second anything at all), the first ought to propose giving the second as little as possible. This is the unique subgame perfect equilibrium
Subgame perfect equilibrium
In game theory, a subgame perfect equilibrium is a refinement of a Nash equilibrium used in dynamic games. A strategy profile is a subgame perfect equilibrium if it represents a Nash equilibrium of every subgame of the original game...

 of the Ultimatum Game. (However, the Ultimatum Game does have several other Nash equilibria
Nash equilibrium
In game theory, Nash equilibrium is a solution concept of a game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his own strategy unilaterally...

 which are not subgame perfect.)

See also centipede game
Centipede game
In game theory, the centipede game, first introduced by Rosenthal , is an extensive form game in which two players take turns choosing either to take a slightly larger share of a slowly increasing pot, or to pass the pot to the other player...

.

Backward induction and economic entry

Consider a dynamic game in which the players are an incumbent firm in an industry and a potential entrant to that industry. As it stands, the incumbent has a monopoly
Monopoly
A monopoly exists when a specific person or enterprise is the only supplier of a particular commodity...

 over the industry and does not want to lose some of its market share to the entrant. If the entrant chooses not to enter, the payoff to the incumbent is high (it maintains its monopoly) and the entrant neither loses nor gains (its payoff is zero). If the entrant enters, the incumbent can "fight" or "accommodate" the entrant. It will fight by lowering its price, running the entrant out of business (and incurring exit costs — a negative payoff) and damaging its own profits. If it accommodates the entrant it will lose some of its sales, but a high price will be maintained and it will receive greater profits than by lowering its price (but lower than monopoly profits).

Say that, the best response of the incumbent is to accommodate if the entrant enters. If the incumbent accommodates, the best response of the entrant is to enter (and gain profit). Hence the strategy profile in which the incumbent accommodates if the entrant enters and the entrant enters if the incumbent accommodates is a Nash equilibrium
Nash equilibrium
In game theory, Nash equilibrium is a solution concept of a game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his own strategy unilaterally...

. However, if the incumbent is going to play fight, the best response of the entrant is to not enter. If the entrant does not enter, it does not matter what the incumbent chooses to do (since there is no other firm to do it to — note that if the entrant does not enter, fight and accommodate yield the same payoffs to both players; the incumbent will not lower its prices if the entrant does not enter). Hence fight is a best response of the incumbent if the entrant does not enter. Hence the strategy profile in which the incumbent fights if the entrant does not enter and the entrant does not enter if the incumbent fights is a Nash equilibrium. Since the game is dynamic, any claim by the incumbent that it will fight is not a credible threat because by the time the decision node is reached where it can decide to fight (i.e. the entrant has entered), it would be irrational to do so. Therefore this Nash equilibrium can be eliminated by backward induction.

A paradox of backward induction

The unexpected hanging paradox
Unexpected hanging paradox
The unexpected hanging paradox, hangman paradox, unexpected exam paradox, surprise test paradox or prediction paradox is a paradox about a person's expectations about the timing of a future event The unexpected hanging paradox, hangman paradox, unexpected exam paradox, surprise test paradox or...

 is a paradox
Paradox
Similar to Circular reasoning, A paradox is a seemingly true statement or group of statements that lead to a contradiction or a situation which seems to defy logic or intuition...

 related to backward induction. Suppose a prisoner is told that she will be hanged sometime between Monday and Friday of next week. However, the exact day will be a surprise (i.e. she will not know the night before that she will be executed the next day). The prisoner, interested in outsmarting her executioner, attempts to determine which day the execution will occur.

She reasons that it cannot occur on Friday, since if it had not occurred by the end of Thursday, she would know the execution would be on Friday. Therefore she can eliminate Friday as a possibility. With Friday eliminated, she decides that it cannot occur on Thursday, since if it had not occurred on Wednesday, she would know that it had to be on Thursday. Therefore she can eliminate Thursday. This reasoning proceeds until she has eliminated all possibilities. She concludes that she will not be hanged next week.

To her surprise, she is hanged on Wednesday.

Here the prisoner reasons by backward induction, but seems to come to a false conclusion. Note, however, that the description of the problem assumes it is possible to surprise someone who is performing backward induction. The mathematical theory of backward induction does not make this assumption, so the paradox does not call into question the results of this theory. Nonetheless, this paradox has received some substantial discussion by philosophers
Philosophy
Philosophy is the study of general and fundamental problems, such as those connected with existence, knowledge, values, reason, mind, and language. Philosophy is distinguished from other ways of addressing such problems by its critical, generally systematic approach and its reliance on rational...

.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK