False sharing
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, false sharing is a performance degrading usage pattern that can arise in systems with distributed, coherent caches
Cache coherency
In computing, cache coherence refers to the consistency of data stored in local caches of a shared resource.When clients in a system maintain caches of a common memory resource, problems may arise with inconsistent data. This is particularly true of CPUs in a multiprocessing system...

 at the size of the smallest resource block managed by the caching mechanism. When a system participant attempts to periodically access data that will never be altered by another party, but that data shares a cache block with data that is altered, the caching protocol may force the first participant to reload the whole unit despite a lack of logical necessity. The caching system is unaware of activity within this block and forces the first participant to bear the caching system overhead required by true shared access of a resource.

By far the most common usage of this term is in modern multiprocessor
Multiprocessor
Computer system having two or more processing units each sharing main memory and peripherals, in order to simultaneously process programs.Sometimes the term Multiprocessor is confused with the term Multiprocessing....

 CPU cache
CPU cache
A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations...

s, where memory is cached in lines of some small power of two
Power of two
In mathematics, a power of two means a number of the form 2n where n is an integer, i.e. the result of exponentiation with as base the number two and as exponent the integer n....

 word size (e.g., 64 aligned
Data structure alignment
Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks...

, contiguous byte
Byte
The byte is a unit of digital information in computing and telecommunications that most commonly consists of eight bits. Historically, a byte was the number of bits used to encode a single character of text in a computer and for this reason it is the basic addressable element in many computer...

s). If two processors operate on independent data in the same memory address
Memory address
A digital computer's memory, more specifically main memory, consists of many memory locations, each having a memory address, a number, analogous to a street address, at which computer programs store and retrieve, machine code or data. Most application programs do not directly read and write to...

 region storable in a single line, the cache coherency mechanisms in the system may force the whole line across the bus or interconnect with every data write, forcing memory stalls in addition to wasting system bandwidth
Bandwidth (computing)
In computer networking and computer science, bandwidth, network bandwidth, data bandwidth, or digital bandwidth is a measure of available or consumed data communication resources expressed in bits/second or multiples of it .Note that in textbooks on wireless communications, modem data transmission,...

. False sharing is an inherent artifact of automatically synchronized cache protocols and can also exist in environments such as distributed file system or databases, but current prevalence is limited to RAM caches.

Example


struct foo
{
int x, y;
};

foo f;

int sum_a
{
int s = 0;
for (int i = 0; i < 1000000; ++i)
s += f.x;
return s;
}

void inc_b
{
for (int i = 0; i < 1000000; ++i)
++f.y;
}


Here, sum_a may need to continually re-read x from main memory (instead of from cache) even though inc_b's modification of y should be irrelevant.

Tools

  • Acumem SlowSpotter and ThreadSpotter
    Acumem SlowSpotter
    Acumem SlowSpotter and Acumem ThreadSpotter are tools from Acumem, which diagnose performance problems related to data locality, cache utilization and thread interactions....

    , tools which diagnose performance problems related to data locality
    Locality of reference
    In computer science, locality of reference, also known as the principle of locality, is the phenomenon of the same value or related storage locations being frequently accessed. There are two basic types of reference locality. Temporal locality refers to the reuse of specific data and/or resources...

    , cache utilization
    CPU cache
    A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations...

     and thread
    Thread (computer science)
    In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process...

    interactions.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK