Adaptive Multi-Rate
The Adaptive Multi-Rate (AMR) audio codec is a patented audio data compression scheme optimized for speech coding. AMR was adopted as the standard speech codec by 3GPP in October 1999 and is now widely used in GSM and UMTS. It uses link adaptation to select one of eight different bit rates based on link conditions.
AMR is also a file format for storing spoken audio using the AMR codec. Many modern mobile telephone handsets can store short audio recordings in the AMR format, and both free and proprietary programs exist (see Software support) to convert between this and other formats. Because AMR is a speech format, it is unlikely to give ideal results for other audio. The common filename extension is .amr. There is also another storage format for AMR, suited to applications with more advanced demands on the storage format, such as random access or synchronization with video. This format is the 3GPP-specified 3GP container format, which is based on the ISO base media file format.
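As a rough illustration of the .amr storage format, the sketch below walks the frames of a single-channel narrowband file image and reports its duration. The layout it assumes (the `#!AMR\n` magic string, a one-byte header per frame carrying a 4-bit frame type, and the per-mode payload sizes) follows RFC 4867 rather than anything stated in this article, and the function names `scan_amr` and `duration_ms` are made up for the example.

```python
# Sketch of walking the frames of a single-channel narrowband .amr file image.
# The layout is assumed from RFC 4867 (file storage format), not from this article.

AMR_MAGIC = b"#!AMR\n"

# Payload size in bytes for each frame type (FT), excluding the 1-byte header.
# FT 0-7 are the speech modes 4.75 ... 12.2 kbit/s, FT 8 is SID (comfort noise),
# FT 15 is NO_DATA. Other frame types are treated as unsupported here.
PAYLOAD_BYTES = {0: 12, 1: 13, 2: 15, 3: 17, 4: 19, 5: 20, 6: 26, 7: 31, 8: 5, 15: 0}

FRAME_MS = 20  # every frame represents 20 ms of audio


def scan_amr(data: bytes) -> list[int]:
    """Return the frame type of every frame in an AMR file image."""
    if not data.startswith(AMR_MAGIC):
        raise ValueError("missing AMR magic header")
    pos = len(AMR_MAGIC)
    frame_types = []
    while pos < len(data):
        # Header byte layout: P | FT (4 bits) | Q | P | P
        ft = (data[pos] >> 3) & 0x0F
        if ft not in PAYLOAD_BYTES:
            raise ValueError(f"unsupported frame type {ft} at offset {pos}")
        end = pos + 1 + PAYLOAD_BYTES[ft]
        if end > len(data):
            raise ValueError("truncated frame")
        frame_types.append(ft)
        pos = end
    return frame_types


def duration_ms(data: bytes) -> int:
    """Total duration, counting one 20 ms frame per stored frame."""
    return len(scan_amr(data)) * FRAME_MS
```

Calling `duration_ms(open("clip.amr", "rb").read())` on a hypothetical recording would give its length in milliseconds. Multi-channel and wideband files use different magic strings and are deliberately rejected by this sketch.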
Usage
The frames contain 160 samples and are 20 milliseconds long. AMR uses various techniques, such as ACELP, discontinuous transmission (DTX), voice activity detection (VAD) and comfort noise generation (CNG). The usage of AMR requires optimized link adaptation that selects the best codec mode to meet the local radio channel and capacity requirements. If the radio conditions are bad, source coding is reduced and channel coding is increased. This improves the quality and robustness of the network connection while sacrificing some voice clarity. In the particular case of AMR this improvement is somewhere around S/N = 4-6 dB for usable communication. The new intelligent system allows the network operator to prioritize capacity or quality per base station.
There are a total of 14 modes of the AMR codec: 8 are available in a full rate channel (FR) and 6 in a half rate channel (HR).
Mode | Bitrate (kbit/s) | Channel | Compatible with |
---|---|---|---|
AMR_12.20 | 12.20 | FR | ETSI GSM enhanced full rate |
AMR_10.20 | 10.20 | FR | |
AMR_7.95 | 7.95 | FR/HR | |
AMR_7.40 | 7.40 | FR/HR | TIA/EIA IS-641 TDMA enhanced full rate |
AMR_6.70 | 6.70 | FR/HR | ARIB 6.7 kbit/s enhanced full rate |
AMR_5.90 | 5.90 | FR/HR | |
AMR_5.15 | 5.15 | FR/HR | |
AMR_4.75 | 4.75 | FR/HR | |
AMR_SID | 1.80 | FR/HR | |
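The mode selection described above can be sketched as a lookup: given the channel type and an estimate of link quality, pick the highest bit rate whose quality requirement is met, and fall back to the most robust mode otherwise. This is a minimal sketch only. The function `select_mode` and the threshold values in `EXAMPLE_THRESHOLDS_DB` are invented for illustration; real networks configure their own thresholds.

```python
# Speech modes per channel type, from the table above (kbit/s, highest first).
MODES = {
    "FR": [12.20, 10.20, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75],
    "HR": [7.95, 7.40, 6.70, 5.90, 5.15, 4.75],
}

# HYPOTHETICAL minimum link quality (dB) needed to run each mode.
# These numbers are placeholders, not values from any specification.
EXAMPLE_THRESHOLDS_DB = {
    12.20: 16, 10.20: 14, 7.95: 12, 7.40: 11,
    6.70: 10, 5.90: 8, 5.15: 6, 4.75: 4,
}


def select_mode(channel: str, quality_db: float,
                thresholds_db: dict = EXAMPLE_THRESHOLDS_DB) -> float:
    """Return the highest bit rate on `channel` whose threshold is satisfied."""
    candidates = MODES[channel]
    for rate in candidates:
        if quality_db >= thresholds_db[rate]:
            return rate
    # Link is worse than every threshold: use the lowest (most protected) mode.
    return candidates[-1]
```

With the placeholder thresholds, a good link on a full rate channel selects 12.20 kbit/s, while the same link on a half rate channel is capped at 7.95 kbit/s because the two highest modes are not available there.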
Features
- Sampling frequency 8 kHz/13-bit (160 samples for 20 ms frames), filtered to 200–3400 Hz.
- The AMR codec uses eight source codecs with bit-rates of 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbit/s.
- Generates frame lengths of 95, 103, 118, 134, 148, 159, 204, or 244 bits for bit rates of 4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, or 12.2 kbit/s, respectively.
- AMR utilizes Discontinuous Transmission (DTX), with Voice Activity Detection (VAD) and Comfort Noise Generation (CNG) to reduce bandwidth usage during silence periods.
- Algorithmic delay is 20 ms per frame. The 12.2 kbit/s mode needs no algorithmic look-ahead delay; the other rates use a look-ahead delay of 5 ms. The 12.2 kbit/s mode still carries a 5 ms 'dummy' look-ahead delay, to allow seamless frame-wise mode switching with the other rates.
- AMR is a hybrid speech coder which uses Algebraic Code Excited Linear Prediction (ACELP).
- The complexity of the algorithm is rated at 5, using a relative scale where G.711 is 1 and G.729a is 15.
- PSQM testing under ideal conditions yields Mean Opinion Scores of 4.14 for AMR (12.2 kbit/s), compared to 4.45 for G.711 (u-law).
- PSQM testing under network stress yields Mean Opinion Scores of 3.79 for AMR (12.2 kbit/s), compared to 4.13 for G.711 (u-law).
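The frame figures in the list above follow from simple arithmetic: 8 kHz sampling over a 20 ms frame gives 160 samples, and each bit rate multiplied by the frame duration gives the number of coded bits per frame. A short check, with helper names chosen only for this example:

```python
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
BIT_RATES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]


def samples_per_frame() -> int:
    """Input samples covered by one frame."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


def bits_per_frame(kbps: float) -> int:
    """Coded bits per frame: kbit/s times ms gives bits."""
    return round(kbps * FRAME_MS)


def bytes_per_frame(kbps: float) -> int:
    """Whole bytes needed to hold one coded frame (rounded up)."""
    return (bits_per_frame(kbps) + 7) // 8


if __name__ == "__main__":
    print(samples_per_frame())
    for rate in BIT_RATES_KBPS:
        print(rate, bits_per_frame(rate), bytes_per_frame(rate))
```

At 12.2 kbit/s, for instance, 160 thirteen-bit input samples (2080 bits) are represented by 244 coded bits.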
Licensing and patent issues
AMR codecs incorporate several patents of Nokia Corporation, Telefonaktiebolaget L. M. Ericsson, VoiceAge Corporation and Nippon Telegraph and Telephone Corporation. VoiceAge Corporation is the License Administrator for the AMR and AMR-WB+ patent pools. VoiceAge also accepts submission of patents for determination of their possible essentiality to these standards.
The initial fee for professional content creation tools and "real-time channel" products is $6,500. The minimum annual royalty shall be $10,000, excluding the initial fee in year 1 of the license agreement.
The AMR decoder in the category of personal computer products (e.g. media players) is licensed for free. The license fee for a sold encoder is $0.40. The minimum annual royalty does not apply to licensed products that fall under the category of personal computer products and contain only the free decoder.
For more information about this, please refer to:
- VoiceAge licensing information, including pricing to license the AMR codecs
- 3GPP legal issues
- The 3G Patent Platform and its licensing policy
- AMR Codecs as Shared Libraries - legal notices for usage of amrnb and amrwb libraries based on the reference implementation
Software support
- 3GPP TS 26.073 - AMR speech Codec (C source code) - reference implementation
- Audacity (beta version 1.3) via the FFmpeg integration libraries
- FFmpeg with OpenCORE AMR libraries
- Android
- AMR Codecs as Shared Libraries - amrnb and amrwb libraries development site. These libraries are based on the reference implementation and were created to prevent embedding of possibly patented source code into many open source projects.
- Open source software to convert the .amr format: RetroCode and Amr2Wav, both in an early developmental stage
- AMR Player is freeware that plays AMR audio files and can convert between AMR and the MP3/WAV audio formats.
- MPlayer (SMPlayer, KMPlayer)
- QuickTime player and multimedia framework
- RealPlayer version 11 and later
- VLC media player version 1.1.0 and later
- ffdshow
- Apple iPhone (can play back AMR files)
- BlackBerry smartphones (used as the voice recorder file format)
See also
- Adaptive Multi-Rate Wideband (AMR-WB)
- Extended Adaptive Multi-Rate - Wideband (AMR-WB+)
- Half Rate
- Full Rate
- Enhanced Full Rate (EFR)
- Sampling rate
- IS-641
- 3GP
External links
- http://amrplayer.com/
- 3GPP TS 26.090 - Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec; Transcoding functions
- 3GPP TS 26.071 - Mandatory Speech Codec speech processing functions; AMR Speech Codec; General Description
- 3GPP codecs specifications; 3G and beyond / GSM, 26 series
- RFC 4867 - RTP Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs
- RFC 4281 - The Codecs Parameter for "Bucket" Media Types