Mean Opinion Score
The Mean Opinion Score test has been used for decades in telephony networks to obtain the human user's view of the quality of the network. In multimedia (audio, voice telephony, or video), especially when codecs are used to compress the bandwidth requirement (for example, of a digitized voice connection from the standard 64 kilobit/second PCM modulation), the mean opinion score (MOS) provides a numerical indication of the perceived quality, from the users' perspective, of received media after compression and/or transmission. The MOS is expressed as a single number in the range 1 to 5, where 1 is the lowest perceived audio quality and 5 is the highest.

MOS tests for voice are specified by ITU-T recommendation P.800.

The MOS is generated by averaging the results of a set of standard, subjective tests where a number of listeners rate the heard audio quality of test sentences read aloud by both male and female speakers over the communications medium being tested. A listener is required to give each sentence a rating using the following rating scheme:
Mean opinion score (MOS)
MOS   Quality     Impairment
5     Excellent   Imperceptible
4     Good        Perceptible but not annoying
3     Fair        Slightly annoying
2     Poor        Annoying
1     Bad         Very annoying


The MOS is the arithmetic mean of all the individual scores, and can range from 1 (worst) to 5 (best).
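The averaging step can be illustrated with a minimal Python sketch. The function name `mean_opinion_score` and the sample ratings are hypothetical and chosen only for illustration; the scale labels are the ones listed in the rating scheme above.

```python
from statistics import mean

# Labels of the five-point scale from the rating scheme above.
SCALE = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of individual ratings on the 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in SCALE:
            raise ValueError(f"rating {rating!r} is not on the 1-5 scale")
    return mean(ratings)


# Hypothetical ratings given by eight listeners to one test condition.
ratings = [4, 5, 3, 4, 4, 3, 5, 4]
print(f"MOS = {mean_opinion_score(ratings):.2f}")
```

Because the result is an average over listeners, it is usually a fractional value even though each individual rating is a whole number.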

Compressor/decompressor (codec) systems and digital signal processing (DSP) are commonly used in voice communications, and can be configured to conserve bandwidth, but there is a trade-off between voice quality and bandwidth conservation. The best codecs provide the most bandwidth conservation while producing the least degradation of voice quality. Bandwidth can be measured quantitatively, but voice quality requires human interpretation, although estimates of voice quality can be made by automatic test systems.

A similar process can be used to evaluate subjective video quality.

As an example, the following are mean opinion scores for one implementation of different codecs (source: http://www.cisco.com/en/US/tech/tk1077/technologies_tech_note09186a00800b6710.shtml#mos):
Codec           Data rate [kbit/s]   Mean opinion score (MOS)
G.711 (ISDN)    64                   4.1
iLBC            15.2                 4.14
AMR             12.2                 4.14
G.729           8                    3.92
G.723.1 r63     6.3                  3.9
GSM EFR         12.2                 3.8
G.726 ADPCM     32                   3.85
G.729a          8                    3.7
G.723.1 r53     5.3                  3.65
G.728           16                   3.61
GSM FR          12.2                 3.5
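The trade-off between bandwidth and voice quality in the table above can be explored with a short Python sketch. It lists the codecs that no other codec in the table beats on both data rate and MOS at once. The helper name `pareto_front` is hypothetical, and the figures are simply copied from the table, so the result says nothing beyond that one set of measurements.

```python
# (codec, data rate in kbit/s, MOS), copied from the table above.
CODECS = [
    ("G.711 (ISDN)", 64, 4.1),
    ("iLBC", 15.2, 4.14),
    ("AMR", 12.2, 4.14),
    ("G.729", 8, 3.92),
    ("G.723.1 r63", 6.3, 3.9),
    ("GSM EFR", 12.2, 3.8),
    ("G.726 ADPCM", 32, 3.85),
    ("G.729a", 8, 3.7),
    ("G.723.1 r53", 5.3, 3.65),
    ("G.728", 16, 3.61),
    ("GSM FR", 12.2, 3.5),
]


def pareto_front(codecs):
    """Return names of codecs not dominated on (lower rate, higher MOS)."""
    front = []
    for name, rate, mos in codecs:
        # A codec is dominated if another one is at least as good on both
        # criteria and strictly better on at least one of them.
        dominated = any(
            r <= rate and m >= mos and (r < rate or m > mos)
            for _, r, m in codecs
        )
        if not dominated:
            front.append(name)
    return front


for name in pareto_front(CODECS):
    print(name)
```

For these figures the sketch keeps AMR, G.729 and the two G.723.1 rates; every other entry has a counterpart in the table with an equal or lower data rate and an equal or higher score.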


A drawback of obtaining MOS ratings is that it can be time-consuming and expensive, because it requires recruiting a panel of people to make the assessments. When a voice coding system is under development, or a developer has to test and compare several audio systems, it is important to have a way to run a quick check.

Some suitable English-language phrases used for determining a MOS, as suggested by ITU-T recommendation P.800, are:
  • You will have to be very quiet.
  • There was nothing to be seen.
  • They worshipped wooden idols.
  • I want a minute with the inspector.
  • Did he need any money?


Analytical formulas exist that estimate the MOS from the packet loss, expressed as a percentage, and the packet duration in milliseconds (see the paper referenced under External links):
Predicted MOS = 4.0 - 0.7 ln(%loss) - 0.1 ln(size_ms)
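
As a rough illustration, the formula above can be evaluated directly. The Python sketch below assumes that "ln" is the natural logarithm and that both inputs are strictly positive; the function name `predicted_mos` and the sample inputs are hypothetical.

```python
import math


def predicted_mos(loss_percent, size_ms):
    """Evaluate 4.0 - 0.7 ln(%loss) - 0.1 ln(size_ms)."""
    # The logarithm is undefined at zero, so 0% loss cannot be evaluated.
    if loss_percent <= 0 or size_ms <= 0:
        raise ValueError("loss_percent and size_ms must both be positive")
    return 4.0 - 0.7 * math.log(loss_percent) - 0.1 * math.log(size_ms)


# Hypothetical inputs: 20 ms packets at increasing loss rates.
for loss in (1.0, 2.0, 5.0, 10.0):
    print(f"{loss:4.1f}% loss, 20 ms packets: {predicted_mos(loss, 20.0):.2f}")
```

Note that the expression is not confined to the 1 to 5 scale: for loss rates well below 1% it yields values above 5, so it is presumably meaningful only over the range of conditions for which it was fitted.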

See also

  • Subjective video quality
  • MUSHRA, ITU-R Recommendation BS.1534
  • PSQM, Perceptual Speech Quality Measure (ITU-T P.861, withdrawn and replaced with PESQ, ITU-T P.862)
  • PESQ, Perceptual Evaluation of Speech Quality, a mechanism for automated assessment of the speech quality experienced by the user of a telephone system. It is standardised as ITU-T recommendation P.862 (02/01).
  • PEVQ, Perceptual Evaluation of Video Quality, a measurement algorithm for the automated assessment of video quality.
  • PEAQ, Perceptual Evaluation of Audio Quality, a measurement algorithm for the automated assessment of audio quality.
  • Absolute Category Rating
  • MNRU

External links

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 