Code Excited Linear Prediction
Code-excited linear prediction (CELP) is a speech coding algorithm originally proposed by M.R. Schroeder and B.S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as residual-excited linear prediction and linear predictive coding vocoders (e.g., FS-1015). Along with its variants, such as algebraic CELP, relaxed CELP, low-delay CELP and vector sum excited linear prediction, it is currently the most widely used speech coding algorithm. It is also used in MPEG-4 Audio speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.

Introduction

The CELP algorithm is based on four main ideas:
  • Using the source-filter model of speech production through linear prediction (LP) (see the textbook "speech coding algorithm");
  • Using an adaptive and a fixed codebook as the input (excitation) of the LP model;
  • Performing a search in closed-loop in a “perceptually weighted domain”;
  • Applying vector quantization (VQ).
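
As a rough illustration of the vector quantization idea in the last item, the sketch below replaces a vector with the index of its nearest codebook entry. The function name `quantize` and the tiny two-dimensional codebook are hypothetical, chosen only for illustration; real codecs use much larger, trained or structured codebooks.

```python
def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector`.

    Closeness is measured by squared Euclidean distance. Only the
    index needs to be transmitted; the decoder looks the entry up
    in its own copy of the codebook.
    """
    best_index = None
    best_distance = None
    for index, entry in enumerate(codebook):
        distance = sum((v - e) ** 2 for v, e in zip(vector, entry))
        if best_distance is None or distance < best_distance:
            best_index = index
            best_distance = distance
    return best_index


# Hypothetical three-entry codebook of two-dimensional vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
index = quantize([0.9, 1.2], codebook)
reconstructed = codebook[index]
```

The vector is represented by a single small integer, at the cost of the difference between the input and the chosen entry.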


The original algorithm as simulated in 1983 by Schroeder and Atal required 150 seconds to encode 1 second of speech when run on a Cray-1 supercomputer. Since then, more efficient ways of implementing the codebooks and improvements in computing capabilities have made it possible to run the algorithm in embedded devices, such as mobile phones.

CELP decoder

Before exploring the complex encoding process of CELP, we introduce the decoder. Figure 1 describes a generic CELP decoder. The excitation is produced by summing the contributions from an adaptive (also known as pitch) codebook and a stochastic (also known as innovation or fixed) codebook:

$$e[n] = e_a[n] + e_f[n]$$

where $e_a[n]$ is the adaptive (pitch) codebook contribution and $e_f[n]$ is the stochastic (innovation or fixed) codebook contribution. The fixed codebook is a vector quantization dictionary that is (implicitly or explicitly) hard-coded into the codec. This codebook can be algebraic (ACELP) or be stored explicitly (e.g. Speex). The entries in the adaptive codebook consist of delayed versions of the excitation. This makes it possible to efficiently code periodic signals, such as voiced sounds.
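
The way the two codebook contributions combine can be sketched as follows. This is a minimal illustration, not the procedure of any particular codec: the function and parameter names are hypothetical, and the adaptive codebook is modelled as a one-tap long-term predictor, so a lag shorter than the subframe reuses samples generated earlier in the same subframe. Real codecs differ in how they handle that case.

```python
def build_excitation(past_excitation, pitch_lag, pitch_gain,
                     fixed_vector, fixed_gain):
    """Return one subframe of excitation: adaptive plus fixed contribution.

    past_excitation: previously decoded excitation samples (oldest first).
    pitch_lag:       delay in samples selected from the adaptive codebook.
    fixed_vector:    entry selected from the fixed (innovation) codebook.
    """
    if pitch_lag < 1 or pitch_lag > len(past_excitation):
        raise ValueError("pitch_lag must lie within the stored history")
    history = list(past_excitation)
    subframe = []
    for n in range(len(fixed_vector)):
        # Adaptive contribution: the excitation pitch_lag samples ago.
        adaptive = pitch_gain * history[-pitch_lag]
        # Fixed contribution: the scaled codebook entry.
        fixed = fixed_gain * fixed_vector[n]
        sample = adaptive + fixed
        subframe.append(sample)
        history.append(sample)  # becomes part of the adaptive codebook
    return subframe


# A single past pulse repeated with period 4 and gain 0.5,
# plus one innovation pulse scaled by 2.
subframe = build_excitation(
    past_excitation=[1.0, 0.0, 0.0, 0.0],
    pitch_lag=4,
    pitch_gain=0.5,
    fixed_vector=[0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    fixed_gain=2.0,
)
```

Because each new sample is appended to the history, the innovation pulse itself reappears one pitch period later, which is how a periodic (voiced) excitation is sustained from very few transmitted parameters.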

The filter that shapes the excitation has an all-pole model of the form $1/A(z)$, where $A(z)$ is called the prediction filter and is obtained using linear prediction (Levinson–Durbin algorithm). An all-pole filter is used because it is a good representation of the human vocal tract and because it is easy to compute.
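
The following sketch shows one way to obtain the prediction filter with the Levinson–Durbin recursion and then to run an excitation through the resulting all-pole filter. The sign convention $A(z) = 1 + \sum_{k=1}^{p} a_k z^{-k}$ is an assumption made here (some texts use the opposite sign), and the function names are hypothetical.

```python
def levinson_durbin(r, order):
    """Solve for the prediction filter from autocorrelation values.

    r:     autocorrelation values r[0], r[1], ..., r[order].
    Returns (a, error), where a = [1, a_1, ..., a_order] are the
    coefficients of A(z) under the convention stated above and
    error is the final prediction error energy.
    """
    if r[0] <= 0.0:
        raise ValueError("r[0] must be positive")
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / error  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
        if error <= 0.0:
            break  # signal is perfectly predictable at this order
    return a, error


def synthesize(excitation, a):
    """Filter the excitation through 1/A(z), starting from zero state.

    Implements y[n] = x[n] - sum_k a[k] * y[n - k].
    """
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k in range(1, len(a)):
            if n - k >= 0:
                y -= a[k] * out[n - k]
        out.append(y)
    return out


# Autocorrelation of a first-order process with correlation 0.5.
coeffs, residual_energy = levinson_durbin([1.0, 0.5, 0.25], order=2)
impulse_response = synthesize([1.0, 0.0, 0.0, 0.0], [1.0, -0.5])
```

For this first-order example the second coefficient comes out as zero, since one tap already explains the correlation, and the impulse response of the synthesis filter decays by a factor of 0.5 per sample.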

CELP encoder

The main principle behind CELP is called Analysis-by-Synthesis (AbS) and means that the encoding (analysis) is performed by perceptually optimizing the decoded (synthesis) signal in a closed loop. In theory, the best CELP stream would be produced by trying all possible bit combinations and selecting the one that produces the best-sounding decoded signal. This is obviously not possible in practice for two reasons: the required complexity is beyond any currently available hardware and the “best sounding” selection criterion implies a human listener.

In order to achieve real-time encoding using limited computing resources, the CELP search is broken down into smaller, more manageable, sequential searches using a simple perceptual weighting function. Typically, the encoding is performed in the following order:
  • Linear Prediction Coefficients (LPC) are computed and quantized, usually as line spectral pairs (LSPs)
  • The adaptive (pitch) codebook is searched and its contribution removed
  • The fixed (innovation) codebook is searched
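
The sequential search in the last two steps can be sketched as below. This is a simplified illustration under several assumptions: `target` is taken to be the signal to match for the current subframe, already perceptually weighted and with the filter's ringing from previous subframes removed; candidate lags are at least one subframe long, so each adaptive entry is a plain slice of the past excitation; gains are left unquantized. All names are hypothetical.

```python
def filtered(vector, a):
    """Zero-state response of the all-pole filter 1/A(z) to `vector`."""
    out = []
    for n, x in enumerate(vector):
        y = x
        for k in range(1, len(a)):
            if n - k >= 0:
                y -= a[k] * out[n - k]
        out.append(y)
    return out


def best_match(target, candidates, a):
    """Pick the candidate whose filtered, optimally scaled version is
    closest to `target` in squared error.

    Returns (index, gain, filtered_candidate).
    """
    best = None
    for index, candidate in enumerate(candidates):
        y = filtered(candidate, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # a silent candidate cannot reduce the error
        corr = sum(t * v for t, v in zip(target, y))
        # With the optimal gain corr / energy, the error equals
        # |target|^2 - corr^2 / energy, so maximize corr^2 / energy.
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, index, corr / energy, y)
    if best is None:
        raise ValueError("no usable candidate")
    return best[1], best[2], best[3]


def search_subframe(target, past_excitation, lags, fixed_codebook, a):
    """Adaptive codebook search, then fixed codebook search on what is left."""
    n = len(target)
    start = len(past_excitation)
    adaptive = [past_excitation[start - lag:start - lag + n] for lag in lags]
    i, pitch_gain, contribution = best_match(target, adaptive, a)
    # Remove the adaptive contribution before searching the fixed codebook.
    remaining = [t - pitch_gain * c for t, c in zip(target, contribution)]
    j, fixed_gain, _ = best_match(remaining, fixed_codebook, a)
    return lags[i], pitch_gain, j, fixed_gain


result = search_subframe(
    target=[2.0, 0.0, 3.0, 0.0],
    past_excitation=[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    lags=[4, 8],
    fixed_codebook=[[0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0, 0.0]],
    a=[1.0],  # trivial filter, so candidates are compared directly
)
```

Searching the two codebooks one after the other is not guaranteed to find the jointly best pair, but it keeps the number of synthesized candidates proportional to the sum of the codebook sizes instead of their product.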

Noise weighting

Most (if not all) modern audio codecs attempt to shape the coding noise so that it appears mostly in the frequency regions where the ear cannot detect it. For example, the ear is more tolerant of noise in louder parts of the spectrum and less tolerant of it in quieter parts. For this reason, instead of minimizing the simple quadratic error, CELP minimizes the error in the perceptually weighted domain. The weighting filter $W(z)$ is typically derived from the LPC filter by the use of bandwidth expansion:

$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}$$

where $\gamma_1 > \gamma_2$.
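
In terms of coefficients, replacing $z$ by $z/\gamma$ multiplies the $k$-th coefficient of $A(z)$ by $\gamma^k$, which moves the roots towards the origin. The sketch below builds the weighting filter that way and applies it to a signal. The function names are hypothetical, the default values 0.9 and 0.6 are illustrative choices only, and the coefficient list is assumed to start with $a_0 = 1$.

```python
def bandwidth_expand(a, gamma):
    """Return the coefficients of A(z / gamma): a[k] scaled by gamma**k."""
    return [coef * gamma ** k for k, coef in enumerate(a)]


def weighting_filter(signal, a, gamma1=0.9, gamma2=0.6):
    """Apply W(z) = A(z / gamma1) / A(z / gamma2) from zero initial state.

    a: coefficients [1, a_1, ..., a_p] of the prediction filter A(z).
    The gamma defaults are illustrative assumptions, not values
    prescribed by the text.
    """
    num = bandwidth_expand(a, gamma1)  # zeros of W(z)
    den = bandwidth_expand(a, gamma2)  # poles of W(z)
    out = []
    for n in range(len(signal)):
        y = 0.0
        for k in range(len(num)):
            if n - k >= 0:
                y += num[k] * signal[n - k]
        for k in range(1, len(den)):
            if n - k >= 0:
                y -= den[k] * out[n - k]
        out.append(y)
    return out


expanded = bandwidth_expand([1.0, -0.5, 0.25], 0.5)
weighted = weighting_filter([1.0, 0.0, 0.0], [1.0, -1.0],
                            gamma1=0.5, gamma2=0.25)
```

In an encoder, both the target signal and each synthesized candidate would be passed through this filter before the squared error is computed, so that the search minimizes the weighted error instead of the plain one.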

See also

  • MPEG-4 Part 3 (CELP as an MPEG-4 Audio Object Type)
  • G.728 - Coding of speech at 16 kbit/s using low-delay code excited linear prediction
  • G.718 - uses CELP for the lower two layers for the band (50–6400 Hz) in a two-stage coding structure
  • G.729.1 - uses CELP coding for the lower band (50–4000 Hz) in a three-stage coding structure
  • Comparison of audio codecs


External links

  • This is based on a paper presented at Linux.Conf.Au
  • Some parts based on the Speex codec manual
  • reference implementations of CELP 1016A (CELP 3.2a) and LPC 10e.
  • Linear Predictive Coding (LPC)
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 