Hardware scout
Hardware scout is a technique that uses otherwise idle processor execution resources to perform prefetching during cache misses. When a thread is stalled by a cache miss, the processor pipeline checkpoints the register file, switches to run-ahead mode, and continues to issue instructions from the thread that is waiting for memory. The thread of execution in run-ahead mode is known as a scout thread. When the data returns from memory, the processor restores the register file contents from the checkpoint and switches back to normal execution mode.
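The checkpoint, run-ahead, and restore sequence can be illustrated with a toy model. The sketch below is not any real processor's implementation: the instruction encoding, the names `run`, `scout_ahead` and `MISS_LATENCY`, and the timing model are all assumptions made for illustration. In particular, it assumes one address per cache line, and it assumes that a prefetch issued in run-ahead mode has arrived by the time normal execution reaches the same load.

```python
MISS_LATENCY = 100  # assumed cost, in cycles, of a load that misses the cache


def scout_ahead(program, pc, regs, memory, cache, missing_reg):
    """Run ahead of the stalled load at `pc`, prefetching what can be reached."""
    checkpoint = dict(regs)      # checkpoint the register file
    invalid = {missing_reg}      # registers whose values depend on missing data
    for op in program[pc + 1: pc + 1 + MISS_LATENCY]:
        if op[0] == "addi":
            _, dst, src, imm = op
            if src in invalid:
                invalid.add(dst)
            else:
                regs[dst] = regs[src] + imm
                invalid.discard(dst)
        else:  # ("load", dst, base, offset)
            _, dst, base, off = op
            if base in invalid:
                invalid.add(dst)         # address unknown: nothing to prefetch
                continue
            addr = regs[base] + off
            if addr in cache:
                regs[dst] = memory[addr]
                invalid.discard(dst)
            else:
                cache.add(addr)          # prefetch (assumed to arrive in time)
                invalid.add(dst)         # the data itself is not available yet
    regs.clear()
    regs.update(checkpoint)      # discard run-ahead results, restore checkpoint


def run(program, initial_regs, memory, scout):
    """Execute `program` in order; return (cycle count, final registers)."""
    regs = dict(initial_regs)
    cache = set()
    cycles = 0
    for pc, op in enumerate(program):
        if op[0] == "addi":
            _, dst, src, imm = op
            regs[dst] = regs[src] + imm
            cycles += 1
            continue
        _, dst, base, off = op
        addr = regs[base] + off
        if addr in cache:
            cycles += 1
        else:
            if scout:
                scout_ahead(program, pc, regs, memory, cache, dst)
            cycles += MISS_LATENCY
            cache.add(addr)
        regs[dst] = memory[addr]
    return cycles, regs
```

Under these assumptions, four independent loads that all miss cost 400 cycles without scouting and 103 cycles with it, while the final register contents are identical in both cases. A pointer-chasing sequence, in which each load address depends on the previous load's data, costs 400 cycles either way, because the scout thread cannot compute the addresses it would need to prefetch.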

The computation during run-ahead mode is discarded by the processor; nevertheless, scouting provides speedup because memory-level parallelism (MLP) is increased. The cache lines brought into the cache hierarchy are often used by the processor again when it switches back to normal mode.
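A rough, idealized estimate shows why higher MLP gives a speedup. The symbols here are assumptions introduced for illustration: $N$ is the number of cache misses the thread encounters, $L$ is the miss latency in cycles, and $m$ is the average number of misses outstanding at the same time. Contention for memory bandwidth and all non-memory work are ignored.

```latex
\begin{align*}
T_{\text{stall}}(m) &\approx \frac{N \cdot L}{m}, \\
\text{speedup of stall time} &= \frac{T_{\text{stall}}(1)}{T_{\text{stall}}(m)} \approx m .
\end{align*}
```

Without scouting the misses are serviced one at a time ($m = 1$), so the stall time is about $N \cdot L$. If the scout thread uncovers, for example, three further independent misses during each stall, then $m = 4$ and the stall time falls to roughly a quarter. Misses whose addresses depend on the missing data cannot be overlapped in this way, so the achievable $m$ depends on the program.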

Rock processor scout

Sun's Rock processor (later canceled) used a form of hardware scout. Unlike the basic scheme, in which all run-ahead computation is discarded, Rock could immediately retire any computations in run-ahead mode that do not depend on the cache miss. This allows both prefetching and traditional instruction-level parallelism.
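Which run-ahead instructions "do not depend on the cache miss" can be determined by propagating a dependence mark from the register that the missing load was supposed to fill. The sketch below shows only that idea; it is not a description of Rock's actual hardware, and the function name `split_by_miss_dependence` and the `(dst, srcs)` instruction encoding are hypothetical.

```python
def split_by_miss_dependence(instructions, missing_reg):
    """Classify run-ahead instructions by whether they depend on the miss.

    `instructions` is a list of (dst, srcs) pairs in program order, where
    `dst` is the destination register and `srcs` the source registers.
    Returns (independent, dependent) as lists of instruction indices.
    """
    tainted = {missing_reg}          # registers that depend on the missing data
    independent, dependent = [], []
    for index, (dst, srcs) in enumerate(instructions):
        if any(src in tainted for src in srcs):
            tainted.add(dst)         # the result inherits the dependence
            dependent.append(index)
        else:
            tainted.discard(dst)     # dst now holds a miss-independent value
            independent.append(index)
    return independent, dependent


# Example: the load that missed was supposed to write r1.
ahead = [
    ("r2", ["r1"]),        # uses the missing value
    ("r3", ["r4"]),        # unrelated to the miss
    ("r5", ["r2", "r3"]),  # depends on the miss through r2
    ("r1", ["r4"]),        # overwrites r1 with a miss-independent value
    ("r6", ["r1"]),        # reads the new r1, so no longer dependent
]
independent, dependent = split_by_miss_dependence(ahead, "r1")
```

In this example, instructions 1, 3 and 4 are independent of the miss and are the kind of work that could be retired immediately, while instructions 0 and 2 must wait for the data to return.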

Scouting vs. SMT

Scouting and simultaneous multithreading (SMT) both use hardware threads to fight the memory wall. With scouting, the scout thread runs instructions from the same instruction stream as the instruction that caused the pipeline stall. With SMT, the SMT thread executes instructions from another context.

Thus, SMT increases the throughput of the processor, while scouting improves the performance of the stalled thread by lowering the number of cache misses it encounters once normal execution resumes.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 