Rate–distortion optimization
Encyclopedia
Rate–distortion optimization (RDO) is a method of improving video quality
in video compression. The name refers to the optimization of the amount of distortion (loss of video quality) against the amount of data required to encode the video, the rate. While it is primarily used by video encoders, rate-distortion optimization can be used to improve quality in any encoding situation (image, video, audio, or otherwise) where decisions have to be made that affect both file size and quality simultaneously.
,
and in particular regarding the use of quarter pixel-precision motion estimation
. Adding the extra precision to the motion of a block
during motion estimation might increase quality, but in some cases that extra quality isn't worth the extra bits necessary to encode the motion vector to a higher precision.
, a value representing the relationship between bit cost and quality for a particular quality level. The deviation from the source is usually measured as the mean squared error
, in order to maximize the PSNR video quality metric.
Calculating the bit cost is made more difficult by the entropy encoders
in modern video codecs, requiring the rate-distortion optimization algorithm to pass each block of video to be tested to the entropy coder to measure its actual bit cost. In MPEG codecs, the full process consists of a discrete cosine transform
, followed by quantization
and entropy encoding. Because of this, rate-distortion optimization is much slower than most other block-matching metrics, such as the simple sum of absolute differences
(SAD) and sum of absolute transformed differences
(SATD). As such it is usually used only for the final steps of the motion estimation process, such as deciding between different partition types in H.264/AVC.
Video quality
Video quality is a characteristic of a video passed through a video transmission/processing system, a formal or informal measure of perceived video degradation...
in video compression. The name refers to the optimization of the amount of distortion (loss of video quality) against the amount of data required to encode the video, the rate. While it is primarily used by video encoders, rate-distortion optimization can be used to improve quality in any encoding situation (image, video, audio, or otherwise) where decisions have to be made that affect both file size and quality simultaneously.
Background
The classical method of making encoding decisions is for the video encoder to choose the result which yields the highest quality output image. However, this has the disadvantage that the choice it makes might require more bits while giving comparatively little quality benefit. One common example of this problem is in motion estimationMotion estimation
Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. It is an ill-posed problem as the motion is in three dimensions but the images are a projection of the 3D scene onto a 2D...
,
and in particular regarding the use of quarter pixel-precision motion estimation
Qpel
Quarter pixel refers to a quarter of a standard pixel. It is used in many modern video encoding standards such as MPEG-4 ASP and H.264/AVC to refer to quarter pixel precision in motion estimation and motion compensation...
. Adding the extra precision to the motion of a block
Macroblock
Macroblock is an image compression component and technique based on discrete cosine transform used on still images and video frames. Macroblocks are usually composed of two or more blocks of pixels. In the JPEG standard macroblocks are called MCU blocks....
during motion estimation might increase quality, but in some cases that extra quality isn't worth the extra bits necessary to encode the motion vector to a higher precision.
How it works
Rate–distortion optimization solves the aforementioned problem by acting as a video quality metric, measuring both the deviation from the source material and the bit cost for each possible decision outcome. The bits are mathematically measured by multiplying the bit cost by the LagrangianLagrange multipliers
In mathematical optimization, the method of Lagrange multipliers provides a strategy for finding the maxima and minima of a function subject to constraints.For instance , consider the optimization problem...
, a value representing the relationship between bit cost and quality for a particular quality level. The deviation from the source is usually measured as the mean squared error
Mean squared error
In statistics, the mean squared error of an estimator is one of many ways to quantify the difference between values implied by a kernel density estimator and the true values of the quantity being estimated. MSE is a risk function, corresponding to the expected value of the squared error loss or...
, in order to maximize the PSNR video quality metric.
Calculating the bit cost is made more difficult by the entropy encoders
Entropy encoding
In information theory an entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium....
in modern video codecs, requiring the rate-distortion optimization algorithm to pass each block of video to be tested to the entropy coder to measure its actual bit cost. In MPEG codecs, the full process consists of a discrete cosine transform
Discrete cosine transform
A discrete cosine transform expresses a sequence of finitely many data points in terms of a sum of cosine functions oscillating at different frequencies. DCTs are important to numerous applications in science and engineering, from lossy compression of audio and images A discrete cosine transform...
, followed by quantization
Quantization (image processing)
Quantization, involved in image processing, is a lossy compression technique achieved by compressing a range of values to a single quantum value. When the number of discrete symbols in a given stream is reduced, the stream becomes more compressible. For example, reducing the number of colors...
and entropy encoding. Because of this, rate-distortion optimization is much slower than most other block-matching metrics, such as the simple sum of absolute differences
Sum of absolute differences
Sum of absolute differences is a widely used, extremely simple algorithm for measuring the similarity between image blocks. It works by taking the absolute difference between each pixel in the original block and the corresponding pixel in the block being used for comparison...
(SAD) and sum of absolute transformed differences
Sum of absolute transformed differences
Sum of absolute transformed differences is a widely used video quality metric used for block-matching in motion estimation for video compression. It works by taking a frequency transform, usually a Hadamard transform, of the differences between the pixels in the original block and the...
(SATD). As such it is usually used only for the final steps of the motion estimation process, such as deciding between different partition types in H.264/AVC.
List of encoders that support RDO
- AtemeAtemeATEME is a French broadcast company specialising in video compression MPEG4 encoding/decoding solutions for contribution, video headend, and multi-screen transcoding as well as 3D technology...
H.264 encoder - Grass ValleyGrass Valley (company)Grass Valley, previously known as Grass Valley Group, is a privately held company based in California, USA. Grass Valley produces technology for the video and broadcast industry. On January 29, 2009, Thomson announced its intention to sell the Grass Valley business unit...
ViBE encoders (SD & HD MPEG-2/MPEG-4) - Harmonic Electra 8000 encoder (SD & HD MPEG-2/MPEG-4)
- libavcodecLibavcodeclibavcodec is a free software/open source LGPL-licensed library of codecs for encoding and decoding video and audio data. Same name but incompatible libraries are provided from both FFmpeg project and Libav project....
- MainConceptMainConceptMainConcept GmbH is a software company developing video/audio codecs and also applications and plug-ins related to video/audio encoding.-History:...
H.264 encoder - MicrosoftMicrosoftMicrosoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...
VC-1VC-1VC-1 is the informal name of the SMPTE 421M video codec standard, which was initially developed as a proprietary video format by Microsoft before it was released as a formal SMPTE standard video format on April 3, 2006...
encoder - TANDBERG TelevisionTandberg TelevisionEricsson Television, formerly Tandberg Television, is a company providing MPEG-4 video on demand, and interactive television systems to telecommunications network operators and broadcasters. It was acquired by Swedish company, Ericsson in 2007, and was re-branded as Ericsson Television in 2010.The...
SD MPEG-2 EN8100 - TANDBERG TelevisionTandberg TelevisionEricsson Television, formerly Tandberg Television, is a company providing MPEG-4 video on demand, and interactive television systems to telecommunications network operators and broadcasters. It was acquired by Swedish company, Ericsson in 2007, and was re-branded as Ericsson Television in 2010.The...
HD MPEG-4 EN8190 - TANDBERG TelevisionTandberg TelevisionEricsson Television, formerly Tandberg Television, is a company providing MPEG-4 video on demand, and interactive television systems to telecommunications network operators and broadcasters. It was acquired by Swedish company, Ericsson in 2007, and was re-branded as Ericsson Television in 2010.The...
SD & HD MPEG-4 iPlex - TheoraTheoraTheora is a free lossy video compression format. It is developed by the Xiph.Org Foundation and distributed without licensing fees alongside their other free and open media projects, including the Vorbis audio format and the Ogg container....
1.1-alpha1 and later (the "Thusnelda" branch) - x264X264x264 is a free software library for encoding video streams into the H.264/MPEG-4 AVC format. It is released under the terms of the GNU General Public License.-History:...
H.264 encoder - XvidXviDXvid is a video codec library following the MPEG-4 standard, specifically MPEG-4 Part 2 Advanced Simple Profile . It uses ASP features such as b-frames, global and quarter pixel motion compensation, lumi masking, trellis quantization, and H.263, MPEG and custom quantization matrices.Xvid is a...
MPEG-4 ASP encoder