Sum of absolute differences - AbsoluteAstronomy.com

Sum of absolute differences (SAD) is a widely used, extremely simple algorithm for measuring the similarity between image blocks

Macroblock

Macroblock is an image compression component and technique based on discrete cosine transform used on still images and video frames. Macroblocks are usually composed of two or more blocks of pixels. In the JPEG standard macroblocks are called MCU blocks....

. It works by taking the absolute difference

Absolute difference

The absolute difference of two real numbers x, y is given by |x − y|, the absolute value of their difference. It describes the distance on the real line between the points corresponding to x and y...

between each pixel

Pixel

In digital imaging, a pixel, or pel, is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled....

in the original block and the corresponding pixel in the block being used for comparison. These differences are summed to create a simple metric of block similarity, the L¹ norm

Lp space

In mathematics, the Lp spaces are function spaces defined using a natural generalization of the p-norm for finite-dimensional vector spaces...

of the difference image.

The sum of absolute differences may be used for a variety of purposes, such as object recognition, the generation of disparity maps

Binocular disparity

Binocular disparity refers to the difference in image location of an object seen by the left and right eyes, resulting from the eyes' horizontal separation. The brain uses binocular disparity to extract depth information from the two-dimensional retinal images in stereopsis...

for stereo

Computer stereo vision

Computer stereo vision is the extraction of 3D information from digital images, such as obtained by a CCD camera. By comparing information about a scene from two vantage points, 3D information can be extracted by examination of the relative positions of objects in the two panels...

images, and motion estimation

Motion estimation

Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. It is an ill-posed problem as the motion is in three dimensions but the images are a projection of the 3D scene onto a 2D...

for video compression.

Example

This example uses the sum of absolute differences to identify which part of a search image is most similar to a template image. In this example, the template image is 3 by 3 pixels in size, while the search image is 3 by 5 pixels in size. Each pixel is represented by a single integer

Integer

The integers are formed by the natural numbers together with the negatives of the non-zero natural numbers .They are known as Positive and Negative Integers respectively...

from 0 to 9.



Template    Search image

 2 5 5       2 7 5 8 6

 4 0 7       1 7 4 2 7

 7 5 9       8 4 6 8 5

There are exactly three unique locations within the search image where the template may fit: the left side of the image, the center of the image, and the right side of the image. To calculate the SAD values, the absolute value of the difference between each corresponding pair of pixels is used: the difference between 2 and 2 is 0, 4 and 1 is 3, 7 and 8 is 1, and so forth.

Calculating the SAD values for each of these locations gives the following:



Left    Center   Right

0 2 0   5 0 3    3 3 1

3 7 3   3 4 5    0 2 0

1 1 3   3 1 1    1 3 4

For each of these three image patches, the absolute differences are added together, giving a SAD value of 20, 25, and 17, respectively. From these SAD values, it is apparent that the right side of the search image is the most similar to the template image, because it has the least difference as compared to the other locations.

Object recognition

The sum of absolute differences provides a simple way to automate the searching for objects inside an image, but may be unreliable due to the effects of contextual factors such as changes in lighting, color, viewing direction, size, or shape. The SAD may be used in conjunction with other object recognition methods, such as edge detection

Edge detection

Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities...

, to improve the reliability of results.

Video compression

SAD is an extremely fast metric due to its simplicity; it is effectively the simplest possible metric that takes into account every pixel

Pixel

in a block. Therefore it is very effective for a wide motion search of many different blocks. SAD is also easily parallelizable

Parallel processing

Parallel processing is the ability to carry out multiple operations or tasks simultaneously. The term is used in the contexts of both human cognition, particularly in the ability of the brain to simultaneously process incoming stimuli, and in parallel computing by machines.-Parallel processing by...

since it analyzes each pixel separately, making it easily implementable with such instructions as MMX and SSE2

SSE2

SSE2, Streaming SIMD Extensions 2, is one of the Intel SIMD processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2001. It extends the earlier SSE instruction set, and is intended to fully supplant MMX. Intel extended SSE2 to create SSE3...

. For example, SSE has packed sum of absolute differences instruction (PSADBW) specifically for this purpose. Once candidate blocks are found, the final refinement of the motion estimation process is often done with other slower but more accurate metrics, which better take into account human perception

Visual system

The visual system is the part of the central nervous system which enables organisms to process visual detail, as well as enabling several non-image forming photoresponse functions. It interprets information from visible light to build a representation of the surrounding world...

. These include the sum of absolute transformed differences

Sum of absolute transformed differences

Sum of absolute transformed differences is a widely used video quality metric used for block-matching in motion estimation for video compression. It works by taking a frequency transform, usually a Hadamard transform, of the differences between the pixels in the original block and the...

(SATD), the sum of squared differences (SSD), and rate-distortion optimization.

Example

Object recognition

Video compression

See also