Rate–distortion optimization

Rate–distortion optimization (RDO) is a method of improving video quality in video compression. The name refers to the optimization of the amount of distortion (loss of video quality) against the amount of data required to encode the video, the rate. While it is primarily used by video encoders, rate–distortion optimization can be used to improve quality in any encoding situation (image, video, audio, or otherwise) where decisions have to be made that affect both file size and quality simultaneously.

Background

The classical method of making encoding decisions is for the video encoder to choose the result which yields the highest quality output image. However, this has the disadvantage that the choice it makes might require more bits while giving comparatively little quality benefit. One common example of this problem is in motion estimation, and in particular regarding the use of quarter-pixel-precision motion estimation. Adding the extra precision to the motion of a block during motion estimation might increase quality, but in some cases that extra quality isn't worth the extra bits necessary to encode the motion vector to a higher precision.

How it works

Rate–distortion optimization solves the aforementioned problem by acting as a video quality metric, measuring both the deviation from the source material and the bit cost for each possible decision outcome. The bit cost is weighted by multiplying it by the Lagrange multiplier, a value representing the relationship between bit cost and quality for a particular quality level. The weighted bit cost is added to the distortion, and the outcome with the lowest total is chosen. The deviation from the source is usually measured as the mean squared error, in order to maximize the PSNR video quality metric.
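The decision rule described above is commonly written as a Lagrangian cost J = D + λ·R, where D is the distortion, R the bit cost, and λ the Lagrange multiplier. The following is a minimal sketch of that rule, not the implementation of any particular encoder. The function names, the candidate names, and the distortion and bit figures are all hypothetical and chosen only for illustration.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("blocks must have the same number of samples")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

def psnr(mse_value, max_value=255):
    """PSNR in dB for a given MSE; infinite when the blocks are identical."""
    if mse_value == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse_value)

def rd_cost(distortion, bits, lam):
    """Lagrangian cost J = D + lambda * R."""
    return distortion + lam * bits

def choose(candidates, lam):
    """Return the name of the candidate with the lowest rate-distortion cost.

    candidates maps a name to a (distortion, bits) pair.
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))

# Hypothetical figures for one block: (MSE, bits needed to code the decision).
candidates = {
    "full-pel": (40.0, 10),
    "half-pel": (34.0, 14),
    "quarter-pel": (33.0, 22),
}

for lam in (0.0, 1.0, 2.0):
    print(lam, choose(candidates, lam))
```

With these assumed figures, λ = 0 ignores the bit cost and picks the lowest-distortion candidate, which is the classical quality-only decision. Larger values of λ shift the choice toward cheaper candidates. Because PSNR decreases monotonically as MSE grows, minimizing MSE at a given rate is equivalent to maximizing PSNR.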

Calculating the bit cost is made more difficult by the entropy encoders in modern video codecs, requiring the rate–distortion optimization algorithm to pass each block of video to be tested to the entropy coder to measure its actual bit cost. In MPEG codecs, the full process consists of a discrete cosine transform, followed by quantization and entropy encoding. Because of this, rate–distortion optimization is much slower than most other block-matching metrics, such as the simple sum of absolute differences (SAD) and sum of absolute transformed differences (SATD). As such it is usually used only for the final steps of the motion estimation process, such as deciding between different partition types in H.264/AVC.
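For comparison, the two cheaper block-matching metrics named above can be sketched in a few lines. This is an illustrative sketch only: it assumes 4x4 blocks stored as lists of rows, uses an unnormalized 4-point Hadamard transform, and omits the scaling that real encoders may apply. The function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized 2D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def hadamard4(vec):
    """Unnormalized 4-point Hadamard transform of a length-4 sequence."""
    a, b, c, d = vec
    return [a + b + c + d, a - b + c - d, a + b - c - d, a - b - c + d]

def satd4x4(block_a, block_b):
    """Sum of absolute transformed differences for two 4x4 blocks."""
    # Pixel-wise difference between the blocks.
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(block_a, block_b)]
    # 2D transform: apply the 1D transform to every row, then to every column.
    rows = [hadamard4(r) for r in diff]
    cols = [hadamard4(c) for c in zip(*rows)]
    return sum(abs(v) for col in cols for v in col)

# A uniform brightness offset of 2 across the whole block.
flat_a = [[10] * 4 for _ in range(4)]
flat_b = [[8] * 4 for _ in range(4)]

# A single differing pixel in an otherwise identical block.
spike_a = [[4, 0, 0, 0]] + [[0] * 4 for _ in range(3)]
spike_b = [[0] * 4 for _ in range(4)]

print(sad(flat_a, flat_b), satd4x4(flat_a, flat_b))      # 32 32
print(sad(spike_a, spike_b), satd4x4(spike_a, spike_b))  # 4 64
```

Neither function runs quantization or entropy coding, which is why both are much cheaper than a full rate–distortion evaluation. The transform step in SATD concentrates a smooth difference into a single coefficient and spreads an isolated one across many, so it reacts to the structure of the residual in a way that plain SAD does not.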

List of encoders that support RDO

Ateme H.264 encoder
Grass Valley ViBE encoders (SD & HD MPEG-2/MPEG-4)
Harmonic Electra 8000 encoder (SD & HD MPEG-2/MPEG-4)
libavcodec
MainConcept H.264 encoder
Microsoft VC-1 encoder
TANDBERG Television SD MPEG-2 EN8100
TANDBERG Television HD MPEG-4 EN8190
TANDBERG Television SD & HD MPEG-4 iPlex
Theora 1.1-alpha1 and later (the "Thusnelda" branch)
x264 H.264 encoder
Xvid MPEG-4 ASP encoder

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.