Scoreboarding
Encyclopedia
Scoreboarding is a centralized method, used in the CDC 6600
computer
, for dynamically scheduling a pipeline so that the instructions can execute out of order
when there are no conflicts and the hardware is available. In a scoreboard, the data dependencies
of every instruction are logged. Instructions are released only when the scoreboard determines that there are no conflicts with previously issued and incomplete instructions. If an instruction is stalled because it is unsafe to continue, the scoreboard monitors the flow of executing instructions until all dependencies have been resolved before the stalled instruction is issued.
function issue(op, dst, src1, src2)
wait until (!Busy[FU] AND !Result[dst]); // FU can be any functional unit that can execute operation op
Busy[FU] ← Yes;
Op[FU] ← op;
Fi[FU] ← dst;
Fj[FU] ← src1;
Fk[FU] ← src2;
Qj[FU] ← Result[src1];
Qk[FU] ← Result[src2];
Rj[FU] ← not Qj;
Rk[FU] ← not Qk;
Result[dst] ← FU;
function read_operands(FU)
wait until (Rj[FU] AND Rk[FU]);
Rj[FU] ← No;
Rk[FU] ← No;
function execute(FU)
// Execute whatever FU must do
function write_back(FU)
wait until (f {(Fj[f]≠Fi[FU] OR Rj[f]=No) AND (Fk[f]≠Fi[FU] OR Rk[f]=No)})
foreach f do
if Qj[f]=FU then Rj[f] ← Yes;
if Qk[f]=FU then Rk[f] ← Yes;
Result[Fi[FU]] ← 0;
Busy[FU] ← No;
can avoid the structural hazard and also resolve WAR and WAW dependencies with Register renaming
.
CDC 6600
The CDC 6600 was a mainframe computer from Control Data Corporation, first delivered in 1964. It is generally considered to be the first successful supercomputer, outperforming its fastest predecessor, IBM 7030 Stretch, by about three times...
computer
Computer
A computer is a programmable machine designed to sequentially and automatically carry out a sequence of arithmetic or logical operations. The particular sequence of operations can be changed readily, allowing the computer to solve more than one kind of problem...
, for dynamically scheduling a pipeline so that the instructions can execute out of order
Out-of-order execution
In computer engineering, out-of-order execution is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay...
when there are no conflicts and the hardware is available. In a scoreboard, the data dependencies
Data dependency
A data dependency in computer science is a situation in which a program statement refers to the data of a preceding statement. In compiler theory, the technique used to discover data dependencies among statements is called dependence analysis.There are three types of dependencies: data, name, and...
of every instruction are logged. Instructions are released only when the scoreboard determines that there are no conflicts with previously issued and incomplete instructions. If an instruction is stalled because it is unsafe to continue, the scoreboard monitors the flow of executing instructions until all dependencies have been resolved before the stalled instruction is issued.
Stages
Instructions are decoded in order and go through the following four stages.- Issue: The system checks which registers will be read and written by this instruction. This information is remembered as it will be needed in the following stages. In order to avoid output dependencies (WAW - Write after Write) the instruction is stalled until instructions intending to write to the same register are completed. The instruction is also stalled when required functional units are currently busy.
- Read operands: After an instruction has been issued and correctly allocated to the required hardware module, the instruction waits until all operands become available. This procedure resolves read dependencies (RAW - Read after Write) because registers which are intended to be written by another instruction are not considered available until they are actually written.
- Execution: When all operands have been fetched, the functional unit starts its execution. After the result is ready, the scoreboard is notified.
- Write Result: In this stage the result is about to be written to its destination register. However, this operation is delayed until earlier instructions—which intend to read registers this instruction wants to write to—have completed their read operands stage. This way, so called data dependencies (WAR - Write after Read) can be addressed.
Data structure
To control the execution of the instructions, the scoreboard maintains three status tables:- Instruction Status: Indicates, for each instruction being executed, which of the four stages it is in.
- Functional Unit Status: Indicates the state of each functional unit. Each function unit maintains 9 fields in the table:
- Busy: Indicates whether the unit is being used or not
- Op: Operation to perform in the unit (e.g. MUL, DIV or MOD)
- Fi: Destination register
- Fj,Fk: Source-register numbers
- Qj,Qk: Functional units that will produce the source registers Fj, Fk
- Rj,Rk: Flags that indicates when Fj, Fk are ready
- Register Status: Indicates, for each register, which function unit will write results into it.
The algorithm
The detailed algorithm for the scoreboard control is described below:function issue(op, dst, src1, src2)
wait until (!Busy[FU] AND !Result[dst]); // FU can be any functional unit that can execute operation op
Busy[FU] ← Yes;
Op[FU] ← op;
Fi[FU] ← dst;
Fj[FU] ← src1;
Fk[FU] ← src2;
Qj[FU] ← Result[src1];
Qk[FU] ← Result[src2];
Rj[FU] ← not Qj;
Rk[FU] ← not Qk;
Result[dst] ← FU;
function read_operands(FU)
wait until (Rj[FU] AND Rk[FU]);
Rj[FU] ← No;
Rk[FU] ← No;
function execute(FU)
// Execute whatever FU must do
function write_back(FU)
wait until (f {(Fj[f]≠Fi[FU] OR Rj[f]=No) AND (Fk[f]≠Fi[FU] OR Rk[f]=No)})
foreach f do
if Qj[f]=FU then Rj[f] ← Yes;
if Qk[f]=FU then Rk[f] ← Yes;
Result[Fi[FU]] ← 0;
Busy[FU] ← No;
Remarks
The scoreboarding method must stall the issue stage when there is no functional unit available. In this case, future instructions that could potentially be executed will wait until the structural hazard is resolved. Some other techniques like Tomasulo algorithmTomasulo algorithm
The Tomasulo algorithm is a hardware algorithm developed in 1967 by Robert Tomasulo from IBM. It allows sequential instructions that would normally be stalled due to certain dependencies to execute non-sequentially...
can avoid the structural hazard and also resolve WAR and WAW dependencies with Register renaming
Register renaming
In computer architecture, register renaming refers to a technique used to avoid unnecessary serialization of program operations imposed by the reuse of registers by those operations.-Problem definition:...
.
See also
- Instruction level parallelismInstruction level parallelismInstruction-level parallelism is a measure of how many of the operations in a computer program can be performed simultaneously. Consider the following program: 1. e = a + b 2. f = c + d 3. g = e * f...
- Tomasulo algorithmTomasulo algorithmThe Tomasulo algorithm is a hardware algorithm developed in 1967 by Robert Tomasulo from IBM. It allows sequential instructions that would normally be stalled due to certain dependencies to execute non-sequentially...
- Out-of-order executionOut-of-order executionIn computer engineering, out-of-order execution is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay...
External links
- Dynamic Scheduling - Scoreboard
- Computer Architecture: A Quantitative Approach, John L. Hennessy & David A. Patterson
- EECS 252 Graduate Computer Architecture Lec XX - TOPIC, Electrical Engineering and Computer Sciences, Berkeley, University of California.