Transposition driven scheduling
Encyclopedia
Transposition driven scheduling (TDS) is a load balancing
algorithm for parallel computing
. It was developed at the Vrije Universiteit
in Amsterdam
, The Netherlands as an algorithm to solve puzzle
s. The algorithm provides near-linear speedup with some problems and scales extremely well. It was published about by John Romein, Aske Plaat, Henri Bal
and Jonathan Schaeffer
.
In most puzzles, different ordering of actions can lead to the same position of the puzzle. In puzzles where previous actions do not influence the solution, you need to only evaluate this position once to get a solution for both paths. To avoid evaluating the same position more than once (and thus wasting computation cycles), programs written to solve these kinds of puzzles use transposition table
s. A transposition is a puzzle state that can be reached by different paths but has the same solution. Every time such a program starts evaluating a position, it first looks up in a table if the position has already been evaluated. If it has, the solution is taken from the table instead of calculated, savin large amounts of time.
However, in parallel computing, this approach has a serious drawback. To make full use of the advantages of transposition lookups, all computers in the network have to communicate their solutions to the other computers one way or the other, or you run the risk of redundantly solving some positions. This incurs a severe communication overhead, meaning that a lot of all computers' time is spent communicating with the others instead of solving the problem.
. To begin, a function is defined that assigns a unique value to every board position. Every computer in the network is assigned a range of board positions for which it holds authority. Every computer has its own transposition table and a job queue. Whenever a computer is done with its current job it fetches a new job from the queue. It then computes all possible distinct positions that can be reached from the current position in one action. This is all traditional transposition based problem solving. However, in the traditional method, the computer would now, for every position just computed, ask the computer that holds authority over that position if it has a solution for it. If not, the computer computes the solution recursively and forwards the solution to the computer whose authority it falls under. This is what causes a lot of communication overhead.
(the main cause of delays in request-response models) is not an issue, because a computer simply never waits for an answer to come back. The hardware or operating system can guarantee that the message arrives at its destination so the program does not have to worry about anything anymore after it forwards the job.
Load balancing (computing)
Load balancing is a computer networking methodology to distribute workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources, to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid...
algorithm for parallel computing
Parallel computing
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...
. It was developed at the Vrije Universiteit
Vrije Universiteit
The Vrije Universiteit is a university in Amsterdam, Netherlands. The Dutch name is often abbreviated as VU and in English the university uses the name "VU University". The university is located on a compact urban campus in the southern part of Amsterdam in the Buitenveldert district...
in Amsterdam
Amsterdam
Amsterdam is the largest city and the capital of the Netherlands. The current position of Amsterdam as capital city of the Kingdom of the Netherlands is governed by the constitution of August 24, 1815 and its successors. Amsterdam has a population of 783,364 within city limits, an urban population...
, The Netherlands as an algorithm to solve puzzle
Puzzle
A puzzle is a problem or enigma that tests the ingenuity of the solver. In a basic puzzle, one is intended to put together pieces in a logical way in order to come up with the desired solution...
s. The algorithm provides near-linear speedup with some problems and scales extremely well. It was published about by John Romein, Aske Plaat, Henri Bal
Henri Bal
Henri Elle Bal is a professor of Computer Science at the Vrije Universiteit, Amsterdam in the Netherlands. He is a well-known researcher in computer systems with a specialization in parallel computer systems, languages, and applications....
and Jonathan Schaeffer
Jonathan Schaeffer
Jonathan Herbert Schaeffer is a Canadian researcher and professor at the University of Alberta and the Canada Research Chair in Artificial Intelligence....
.
Transposition based puzzle solving
In a puzzle, all possible plays can be represented in a tree with board positions corresponding to the nodes, moves corresponding to the edges, the initial position as the root of the tree and the solutions as leaves. Cycles in a path, i.e. moves that yield a position that is already encountered higher up in the tree, are left out of the tree because they can never lead to an optimal solution.In most puzzles, different ordering of actions can lead to the same position of the puzzle. In puzzles where previous actions do not influence the solution, you need to only evaluate this position once to get a solution for both paths. To avoid evaluating the same position more than once (and thus wasting computation cycles), programs written to solve these kinds of puzzles use transposition table
Transposition table
In computer chess and other computer games, transposition tables are used to speed up the search of the game tree. Transposition tables are primarily useful in perfect information games, meaning the entire state of the game is known to all players at all times....
s. A transposition is a puzzle state that can be reached by different paths but has the same solution. Every time such a program starts evaluating a position, it first looks up in a table if the position has already been evaluated. If it has, the solution is taken from the table instead of calculated, savin large amounts of time.
However, in parallel computing, this approach has a serious drawback. To make full use of the advantages of transposition lookups, all computers in the network have to communicate their solutions to the other computers one way or the other, or you run the risk of redundantly solving some positions. This incurs a severe communication overhead, meaning that a lot of all computers' time is spent communicating with the others instead of solving the problem.
The traditional setup
To solve this drawback, an approach has been taken that integrates solving the problem and load balancingLoad balancing (computing)
Load balancing is a computer networking methodology to distribute workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources, to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid...
. To begin, a function is defined that assigns a unique value to every board position. Every computer in the network is assigned a range of board positions for which it holds authority. Every computer has its own transposition table and a job queue. Whenever a computer is done with its current job it fetches a new job from the queue. It then computes all possible distinct positions that can be reached from the current position in one action. This is all traditional transposition based problem solving. However, in the traditional method, the computer would now, for every position just computed, ask the computer that holds authority over that position if it has a solution for it. If not, the computer computes the solution recursively and forwards the solution to the computer whose authority it falls under. This is what causes a lot of communication overhead.
The TDS-step
What TDS does is, instead of aksing someone else if it has the solution, is appending the problem to someone else's job queue. In other words, every time a computer has a board position for which it wants a solution, it simply sends it over the network to the responsible computer and does not worry about it anymore. Only if a problem falls within its own authority range will a computer try to look up if it has a solution stored in its own table. If not, it simply appends it to its own queue. If it does have a solution, it does not have to compute anything anymore and fetches a new job from the queue.The difference
What makes the big difference between traditional transposition based problem solving and TDS is that asking some computer if it has solved a problem follows a request-response approach, in which the computer asking the other computer has to wait for a response. In TDS, forwarding a job to another computer does not involve any waiting, because you know (by design) that the other computer will accept the job and try to solve it. This means that latencyLag
Lag is a common word meaning to fail to keep up or to fall behind. In real-time applications, the term is used when the application fails to respond in a timely fashion to inputs...
(the main cause of delays in request-response models) is not an issue, because a computer simply never waits for an answer to come back. The hardware or operating system can guarantee that the message arrives at its destination so the program does not have to worry about anything anymore after it forwards the job.