CELT - AbsoluteAstronomy.com

Constrained Energy Lapped Transform (CELT) is an open, royalty-free audio compression format and a free software codec with especially low algorithmic delay for use in low-latency audio communication. It is a lossy codec, meaning quality is permanently degraded to reduce file size.
The algorithms are openly documented and may be used free of software patent restrictions. It is being developed by the Xiph.Org Foundation (as part of the Ogg codec family) and the codec working group of the Internet Engineering Task Force (IETF).
The reference implementation is a software library named libcelt, which is written in the programming language C and published as free software under Xiph's own 3-clause BSD-ish license.

CELT is meant to bridge the gap between Vorbis and Speex for applications where both high quality audio and low delay are desired. It is suitable to carry both speech and music. It borrows ideas from the CELP algorithm, but avoids some of its limitations by operating in the frequency domain exclusively.

The first development version of CELT was published in December 2007.

Properties

CELT can use sampling rates from 32 kHz to 48 kHz and above, and adaptive bit rates from 32 kbit/s to 128 kbit/s per channel and above. CELT supports mono and stereo and is applicable to both speech and music. It has ultra-low algorithmic delay (as low as 2 ms; scalable, typically from 3 to 9 ms).
There are no known intellectual property issues, and it is released as open source under the permissive 2-clause BSD license.

The goal is a codec for real-time applications. Therefore, the central feature is low algorithmic delay. CELT allows for latencies of typically three to nine milliseconds, and can be configured to below two milliseconds at the price of a higher bitrate to reach similar audio quality. CELT undercuts the latencies that are possible with other commonly used codecs.
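As a rough illustration of where such figures come from, the sketch below converts a frame size and a look-ahead into milliseconds. It assumes that the algorithmic delay is the time needed to buffer one frame plus the look-ahead needed for the overlap with the next block; the function name and the frame and look-ahead sizes are hypothetical values chosen for the example, not settings taken from libcelt.

```python
def algorithmic_delay_ms(frame_samples, lookahead_samples, sample_rate_hz):
    """Delay from buffering one frame plus the look-ahead, in milliseconds.

    Assumption: delay = frame + look-ahead; processing and network time are ignored.
    """
    return 1000.0 * (frame_samples + lookahead_samples) / sample_rate_hz


# Hypothetical frame / look-ahead pairs at a 48 kHz sampling rate.
for frame, lookahead in [(64, 32), (128, 64), (256, 128)]:
    print(frame, lookahead, algorithmic_delay_ms(frame, lookahead, 48000))
```

Under these assumptions the three settings correspond to 2, 4 and 8 ms. Halving the frame size halves the delay, but each frame then has fewer coefficients to describe the spectrum, which is why more bitrate is needed for similar quality.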

Like its sister project Vorbis, it is a fullband (entire human hearing range) general-purpose codec, i.e. not specialised for particular types of audio signals, and therefore different from its other sister project Speex. It processes audio signals with sampling rates between 32 and 96 kHz and up to two channels (stereophonic sound). The format therefore allows transparent results as well as bitrates down to 24 kbit/s. All in all, the compression capabilities are said to be significantly superior to those of the MP3 format. As another useful feature for realtime applications like telephony, CELT performs very well at low bitrates. The audio quality there is said to be superior to Vorbis and even on par with HE-AACv1, thanks to the band folding. In comparative double-blind listening tests it proved to be noticeably superior to HE-AACv1 at ~64 kbit/s.

It has a comparatively low computational complexity that resembles that of the low-delay variant of AAC (AAC-LD) and stays significantly below the complexity of Vorbis.

It supports constant and variable bitrate. If the signal disappears into the noise floor in speech pauses and similar cases, the transmission can be limited to signalling the output of comfort noise to the decoder. Most settings of the naturally streaming-enabled format can be changed on the fly without interrupting transmission.

The format is robust to transmission errors. Loss of whole packets as well as bit errors can be masked with a gradual degradation of audio quality (packet loss concealment, PLC).

Technology

CELT is a transform codec based on the modified discrete cosine transform (MDCT) and concepts from CELP (with a code book for excitation, but in the frequency domain).

The initial PCM-coded signal is cut into relatively small, overlapping blocks for the MDCT (window function) and transformed to frequency coefficients. Choosing an especially short block size enables low latency on the one hand, but also leads to poor frequency resolution that has to be compensated. For a further reduction of the algorithmic delay at the expense of a minor sacrifice in audio quality, the inherent 50 % overlap between the blocks is effectively cut in half by silencing the signal during one eighth at both ends of a block.
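The windowing idea can be sketched in a few lines of Python. The code below is not taken from libcelt: it uses a naive textbook MDCT, and the window (zeros for one eighth of the block at each end, a sine ramp, and a flat top) is an assumed shape chosen only because it is easy to verify. It shows that overlapping blocks still add up to the original signal when part of each block is silenced, as long as the squared window halves sum to one.

```python
import math


def low_overlap_window(n_half, n_zero):
    """Symmetric window of length 2*n_half with n_zero silent samples per end.

    The sine ramp is an assumption for illustration, not CELT's actual window.
    """
    ramp = n_half - 2 * n_zero
    rising = []
    for n in range(n_half):
        if n < n_zero:
            rising.append(0.0)
        elif n < n_half - n_zero:
            rising.append(math.sin(0.5 * math.pi * (n - n_zero + 0.5) / ramp))
        else:
            rising.append(1.0)
    return rising + rising[::-1]


def mdct(block):
    """Naive MDCT: 2N windowed samples in, N coefficients out."""
    n_half = len(block) // 2
    return [sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
                for n, x in enumerate(block))
            for k in range(n_half)]


def imdct(coeffs):
    """Naive inverse MDCT: N coefficients in, 2N aliased samples out."""
    n_half = len(coeffs)
    return [2.0 / n_half * sum(c * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
                               for k, c in enumerate(coeffs))
            for n in range(2 * n_half)]


def roundtrip(signal, n_half, n_zero):
    """Window, transform, invert, window again and overlap-add every block."""
    window = low_overlap_window(n_half, n_zero)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = [signal[start + n] * window[n] for n in range(2 * n_half)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_half):
            out[start + n] += rebuilt[n] * window[n]
    return out
```

Calling `roundtrip(signal, 8, 2)` on a test signal reproduces every sample that is covered by two blocks, just as `roundtrip(signal, 8, 0)` does with a plain sine window. With `n_zero=2` the first and last eighth of each 16-sample block contribute nothing, so the encoder needs less look-ahead before it can transform a block.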

The coefficients are grouped to resemble the critical bands of the human auditory system. The total energy of each group is analysed, and the values are quantised for data reduction and compressed through prediction by only transmitting the difference to the predicted values (delta encoding).
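A minimal sketch of this step follows. The band edges, the quantiser step in decibels and the predictor (each band is simply predicted from the previous band in the same frame) are assumptions made for the example; the text above does not specify how CELT actually predicts the energies.

```python
import math


def band_energies_db(coeffs, band_edges):
    """Total energy of each band in dB; band_edges are hypothetical boundaries."""
    energies = []
    for lo, hi in zip(band_edges, band_edges[1:]):
        energy = sum(c * c for c in coeffs[lo:hi])
        energies.append(10.0 * math.log10(energy + 1e-12))  # avoid log(0)
    return energies


def quantise(values, step_db):
    """Uniform scalar quantiser: map each value to the nearest step index."""
    return [round(v / step_db) for v in values]


def delta_encode(indices):
    """Transmit only the difference to the prediction (here: the previous band)."""
    previous, deltas = 0, []
    for i in indices:
        deltas.append(i - previous)
        previous = i
    return deltas


def delta_decode(deltas):
    """Rebuild the indices by accumulating the transmitted differences."""
    previous, indices = 0, []
    for d in deltas:
        previous += d
        indices.append(previous)
    return indices
```

Because neighbouring bands tend to have similar energy, the differences are usually small numbers, which a subsequent entropy or range coder can store in fewer bits than the raw indices.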

The (unquantised) band energy values are removed from the raw DCT coefficients (normalisation). The coefficients of the resulting residual signal (so-called “band shape”) are coded by Pyramid Vector Quantisation (PVQ, a spherical vector quantisation). This encoding leads to code words of fixed (predictable) length, which in turn enables robustness against bit errors and leaves no need for entropy encoding. Finally, all output of the encoder is coded into one bitstream by a range encoder. In connection with the PVQ, CELT uses a technique known as band folding, which is said to deliver a similar effect to spectral band replication (SBR) by reusing coefficients of lower bands for higher ones, while having much less impact on algorithmic delay and computational complexity than SBR. This works against “birdie” artifacts by preserving more richness in the appropriate frequency bands.
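The normalisation and PVQ steps can be illustrated as follows. This is a sketch under stated assumptions rather than the libcelt routine: the greedy pulse-by-pulse search is one common way to find a PVQ code vector, and the function names are hypothetical. The codebook consists of all integer vectors whose absolute values sum to a fixed number of pulses K; its size follows the standard recurrence V(N, K) = V(N-1, K) + V(N, K-1) + V(N-1, K-1), which is what makes the code word length predictable.

```python
import math


def normalise(band):
    """Split a band into its gain (square root of the energy) and a unit-norm shape."""
    gain = math.sqrt(sum(c * c for c in band))
    if gain == 0.0:
        return 0.0, [0.0] * len(band)
    return gain, [c / gain for c in band]


def pvq_quantise(shape, pulses):
    """Greedy search for an integer vector y with sum(|y_i|) == pulses
    that points in nearly the same direction as shape."""
    magnitudes = [abs(v) for v in shape]
    y = [0] * len(shape)
    for _ in range(pulses):
        corr = sum(m * p for m, p in zip(magnitudes, y))
        energy = sum(p * p for p in y)
        best_index, best_score = 0, -1.0
        for i, m in enumerate(magnitudes):
            # Squared correlation over energy if one more pulse goes to position i.
            new_corr = corr + m
            new_energy = energy + 2 * y[i] + 1
            score = new_corr * new_corr / new_energy
            if score > best_score:
                best_index, best_score = i, score
        y[best_index] += 1
    # Restore the signs of the original coefficients.
    return [p if v >= 0 else -p for p, v in zip(y, shape)]


def pvq_codebook_size(dimensions, pulses):
    """Number of integer vectors of the given dimension with sum(|y_i|) == pulses."""
    table = [[0] * (pulses + 1) for _ in range(dimensions + 1)]
    for n in range(dimensions + 1):
        table[n][0] = 1  # only the all-zero vector uses no pulses
    for n in range(1, dimensions + 1):
        for k in range(1, pulses + 1):
            table[n][k] = table[n - 1][k] + table[n][k - 1] + table[n - 1][k - 1]
    return table[dimensions][pulses]
```

For example, the band `[3.0, -4.0, 0.0]` has a gain of 5 and, with five pulses, is approximated by the integer vector `[2, -3, 0]`. A three-dimensional band with two pulses has 18 possible code vectors, so its index always fits in 5 bits regardless of the signal.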

The decoder unpacks the individual components from the range-coded bitstream, multiplies the band shape coefficients by the band energy and transforms them back (via iMDCT) to PCM data. The individual blocks are rejoined using weighted overlap-add (WOLA).
Many parameters are not explicitly coded, but instead reconstructed by using the same functions as the encoder.
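The multiplication step in the decoder can be sketched as below. The function name is hypothetical, and the sketch assumes that the transmitted band value is an amplitude (the square root of the band energy) and that the band shape arrives as an integer pulse vector that still has to be scaled to unit norm.

```python
import math


def denormalise(band_gain, pulse_vector):
    """Scale a decoded pulse vector to unit norm, then apply the band gain."""
    norm = math.sqrt(sum(p * p for p in pulse_vector))
    if norm == 0.0:
        return [0.0] * len(pulse_vector)
    scale = band_gain / norm
    return [scale * p for p in pulse_vector]
```

Whatever pulse vector was chosen by the encoder, the reconstructed band always carries exactly the transmitted energy, which is the constraint the codec's name refers to.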

For the channel coupling CELT may use M/S stereo or intensity stereo.
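M/S (mid/side) coupling in its generic form can be sketched as follows. The scaling by one half is a common convention assumed here; the text does not state which scaling CELT uses or at which stage it applies the coupling.

```python
def to_mid_side(left, right):
    """Mid carries what both channels share, side carries their difference."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Invert the coupling: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

When both channels are similar, the side signal is close to zero and costs few bits, which is where the saving comes from.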
Blocks can be described independently of adjacent frames (intra-frame), for example to enable a decoder to jump into a running stream.
With transform codecs, so-called pre-echo artifacts can become audible, because the quantisation error of sharp, energy-heavy sounds (transients) can spread over the entire DCT block, and the transient does not mask it backward in time as well as it does forward. With CELT, each block can be divided further to thwart such artifacts.

History

First work on plans and drafts for a Vorbis successor was done in 2005 at Xiph as part of the Ghost project (initially talked about as “Vorbis II”). Besides the codec plans of Vorbis creator Christopher Montgomery, which are on hold in favour of Theora development, this also led to Jean-Marc Valin's concept of a particularly low-latency codec. Valin has been working on CELT since 2007, and on 29 November he committed the first code to the project's repository. In December 2007 the first developer version, 0.0.1, was published, at first named “Code-Excited Lapped Transform”. CELT has been a proposal at the IETF for a free codec standard for telecommunication over the Internet since July 2009, thereby now also involving the codec working group of the IETF in the development. In May 2009, a draft of an RTP payload format for the CELT codec was published.

As of version 0.9, the pitch prediction operating in the frequency domain that had been used so far was replaced by a less complex solution with a pre- and postfilter pair in the time domain, which was contributed by Raymond Chen of Broadcom.

With CELT 0.11 from February 4, 2011, the format was tentatively frozen (“soft freeze”), reserving the possibility of last changes should they unexpectedly become necessary.

Despite the format not being finally frozen, it has been used in the VoIP applications Ekiga and FreeSWITCH since January 2009, and meanwhile also in Mumble, TeamSpeak and other software.

Shortly after the advent of the hybrid codec Opus (formerly known as “Harmony”), the development of CELT as a separate project was halted; instead it lives on as the basis of Opus and is now being developed as a part of this successor project. Opus represents a superset of CELT and the speech codec SILK, in which the CELT algorithms are either not used, used on their own, or used in a hybrid mode, with the SILK algorithms treating the lower part of the spectral range and the CELT algorithms the upper part. The corresponding draft has been at the IETF since September 2010.

In April support for CELT was included in FFmpeg.

Software

In January 2009 support for CELT was added to the Ekiga and FreeSWITCH VoIP programs.

CELT is also supported or used by:

Gablarski
GStreamer
jack-audio-connection-kit (netjack)
liboggz
Mumble (starting with version 1.2)
NexGenVoIP
Radio CHNC
RoarAudio
SFLphone
Soundjack
TeamSpeak 3
SPICE

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.