44,100 Hz
In digital audio, 44,100 Hz is a common sampling frequency: analog audio is recorded by sampling it 44,100 times per second, and these samples are then used to reconstruct the audio signal on playback. "Hz" is an abbreviation for hertz, meaning "cycles per second" or, in this context, "samples per second", and the alternative form 44.1 kHz (kilohertz, thousands of times per second) is also very commonly found.

44.1 kHz audio is widely used because it is the sampling rate of Compact Discs, and its common use dates back to its adoption by Sony in 1979.

History

The 44.1 kHz sampling rate originated in the late 1970s with PCM adaptors, which recorded digital audio on video cassettes (specifically U-matic cassettes), notably the Sony PCM-1600 (1979) and subsequent models in this series. It then became the standard for Compact Disc audio in the Red Book standard (1980), and its use continued in 1990s standards such as MP3 and the DVD, and in 2000s standards such as HDMI.

Why 44.1 kHz?

The rate was chosen following debate between manufacturers, notably Sony and Philips, and its implementation by Sony yielded a de facto standard. The technical reasoning behind the choice is as follows.

Human hearing and signal processing

Firstly, the upper frequency limit of human hearing is about 20 kHz (the hearing range of human ears is roughly 20 Hz to 20,000 Hz), and by the sampling theorem the sampling rate must be at least twice the maximum frequency one wishes to reproduce, so the sampling rate had to be at least 40 kHz.

In addition, signals must be low-pass filtered before sampling, otherwise aliasing occurs. An ideal low-pass filter would perfectly pass frequencies below 20 kHz (without attenuating them) and perfectly cut off frequencies above 20 kHz, but in practice a transition band is necessary, in which frequencies are partly attenuated. The wider this transition band is, the easier and cheaper it is to make a low-pass filter, which favors a higher sampling rate. By the sampling theorem, the resulting increase in sample rate is twice the bandwidth of the transition band: for example, a 2 kHz transition band (passing 20 kHz almost completely, cutting 22 kHz almost completely) requires 44 kHz sampling.
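
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the reasoning in the text; the function names `min_sample_rate_hz` and `transition_band_hz` are made up for the example and do not come from any standard or library.

```python
def min_sample_rate_hz(passband_hz, transition_hz=0.0):
    """Lowest sampling rate whose half (the Nyquist frequency) lies at the
    top of the transition band, i.e. twice (passband + transition)."""
    return 2.0 * (passband_hz + transition_hz)


def transition_band_hz(sample_rate_hz, passband_hz):
    """Width of the band between the top of the passband and half the
    sampling rate, which is what the low-pass filter has to work with."""
    return sample_rate_hz / 2.0 - passband_hz


# Ideal (brick-wall) filter: 20 kHz of audio needs 40 kHz sampling.
print(min_sample_rate_hz(20_000))         # 40000.0
# A 2 kHz transition band raises the requirement by 2 x 2 kHz.
print(min_sample_rate_hz(20_000, 2_000))  # 44000.0
# Transition band actually available at 44.1 kHz.
print(transition_band_hz(44_100, 20_000))  # 2050.0
```

At 44.1 kHz the filter therefore has 2.05 kHz in which to go from passing to blocking, slightly more than the 2 kHz of the example in the text.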

Recording on video equipment

Early digital audio was recorded to existing analog video cassette tapes, as these were the only available media with sufficient capacity to store meaningful lengths of audio; formally, the video cassette is the transport, and this format has been termed pseudo-video. To enable reuse of the video equipment with minimal modification, these recorders ran at the same speed as video and used much of the same circuitry. Specifically, audio samples were recorded as if they were on the lines of a raster scan of video, as follows.

Analog video standards represent video at a field rate of 60 Hz (NTSC, North America; or 60/1.001 Hz ≈ 59.94 Hz for color NTSC) or 50 Hz (PAL, Europe), which corresponds to a frame rate of 30 frames per second (frame/s) or 25 frame/s, since each field holds half the lines of an interlaced image (alternating the odd lines and the even lines). Each frame is in turn composed of lines (see raster scan): 625 lines for PAL and 525 lines for NTSC, though some of the "lines" are actually for synchronizing the signal (see vertical blanking interval), and a field comprises half the visible lines in one vertical scan.

Digital audio samples were then encoded along each line, allowing reuse of the existing synchronization circuitry; viewed as video, the resulting images look like lines of binary black and white (rather, gray) dots along each scan line. The line frequency (lines per second) was 15,625 Hz for PAL (625 × 50/2), 15,750 Hz for 60 Hz (monochrome) NTSC (525 × 60/2), and 15,750/1.001 Hz (approximately 15,734.26 Hz) for 59.94 Hz (color) NTSC. Recording audio at the required rate of over 40 kHz therefore meant encoding multiple samples per line, with 3 samples per line being sufficient, yielding up to 15,625 × 3 = 46,875 samples per second for PAL and 15,750 × 3 = 47,250 for NTSC.

It was desirable to minimize the number of samples per line, so that each sample could have more space devoted to it, making it easier to achieve a higher bit depth (16 bits, rather than 14 or 12 bits, say) and better error tolerance; in practice the signal was stereo, requiring 3 × 2 = 6 samples per line. However, some of these lines were devoted to (vertical) synchronization: the lines during the vertical blanking interval (VBI) could not be used, so a maximum of 490 lines per frame (245 lines per field) could be used in NTSC, and about 588 lines per frame (294 lines per field) in PAL. (Note that in video, PAL has up to 575 visible lines while NTSC has up to 485.)
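
The line-frequency figures and the "3 samples per line" conclusion can be reproduced with a short Python sketch. It is an illustration only: the helper names and the `systems` table are invented for the example, the 40 kHz floor is the one derived from the hearing range, and the upper bounds it prints ignore the lines lost to the vertical blanking interval.

```python
import math
from fractions import Fraction

MIN_RATE_HZ = 40_000  # lower bound implied by a 20 kHz audio bandwidth


def line_frequency_hz(lines_per_frame, field_rate_hz):
    """Lines per second: two interlaced fields make up one frame."""
    return Fraction(lines_per_frame) * Fraction(field_rate_hz) / 2


def samples_per_line_needed(line_freq_hz, min_rate_hz=MIN_RATE_HZ):
    """Smallest whole number of samples per line that reaches min_rate_hz."""
    return math.ceil(Fraction(min_rate_hz) / line_freq_hz)


systems = {
    "PAL": line_frequency_hz(625, 50),
    "NTSC (monochrome)": line_frequency_hz(525, 60),
    "NTSC (color)": line_frequency_hz(525, Fraction(60_000, 1001)),
}

for name, f_line in systems.items():
    n = samples_per_line_needed(f_line)
    print(f"{name}: {float(f_line):.2f} lines/s, {n} samples/line, "
          f"up to {float(n * f_line):.2f} samples/s per channel")
```

Exact fractions are used so that the 60/1.001 Hz color field rate does not pick up floating-point rounding before the final formatting step.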

NTSC and PAL compatibility

It is simplest if the same number of lines is used in each field and, crucially, it was decided to adopt a sample rate that could be used on both NTSC (monochrome) and PAL equipment. Since NTSC has a field rate of 60 Hz and PAL has a field rate of 50 Hz, their least common multiple is 300 Hz, and with 3 samples per line this yields a sample rate that is a multiple of 900 Hz. For NTSC the sample rate is 5m × 60 × 3, where 5m is the number of active lines per field, which must be a multiple of 5 (the rest being used for synchronization), and for PAL the sample rate is 6n × 50 × 3, where 6n is the number of active lines per field, which must be a multiple of 6.

The sampling rates that satisfy these requirements, namely at least 40 kHz (so that 20 kHz sounds can be encoded), no more than 46.875 kHz (so that no more than 3 samples per line are required in PAL), and a multiple of 900 Hz (so that the rate can be encoded in both NTSC and PAL), are thus 40.5, 41.4, 42.3, 43.2, 44.1, 45, 45.9, and 46.8 kHz. The lower ones are eliminated because low-pass filters require a transition band, while the higher ones are eliminated because some lines are required for the vertical blanking interval; 44.1 kHz was the highest usable rate, and was eventually chosen.
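
The selection described above can be checked with a small enumeration in Python. This is a sketch under the constraints stated in the text (3 samples per line, at most 245 usable lines per field in NTSC and 294 in PAL); the variable and function names are chosen for the example, and `math.lcm` requires Python 3.9 or later.

```python
import math

SAMPLES_PER_LINE = 3
NTSC_FIELD_RATE_HZ, NTSC_MAX_LINES_PER_FIELD = 60, 245
PAL_FIELD_RATE_HZ, PAL_MAX_LINES_PER_FIELD = 50, 294
MIN_RATE_HZ = 40_000   # needed to encode 20 kHz audio
MAX_RATE_HZ = 46_875   # 3 samples/line x 15,625 lines/s (PAL)

# A rate usable on both systems must be a multiple of lcm(60, 50) x 3 = 900 Hz.
step_hz = math.lcm(NTSC_FIELD_RATE_HZ, PAL_FIELD_RATE_HZ) * SAMPLES_PER_LINE

# First multiple of step_hz at or above the minimum rate (ceiling division).
first_hz = -(-MIN_RATE_HZ // step_hz) * step_hz
candidates = list(range(first_hz, MAX_RATE_HZ + 1, step_hz))


def lines_per_field(rate_hz, field_rate_hz):
    """Active lines per field needed to carry rate_hz samples per second."""
    return rate_hz // (field_rate_hz * SAMPLES_PER_LINE)


# Keep only rates that fit within the lines left over after vertical blanking.
usable = [
    r for r in candidates
    if lines_per_field(r, NTSC_FIELD_RATE_HZ) <= NTSC_MAX_LINES_PER_FIELD
    and lines_per_field(r, PAL_FIELD_RATE_HZ) <= PAL_MAX_LINES_PER_FIELD
]

print(candidates)   # the eight candidate rates, 40500 ... 46800
print(max(usable))  # 44100
```

The next candidate, 45 kHz, would need 250 lines per field in NTSC and 300 in PAL, both more than the blanking interval leaves available.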

Conclusion

The actual choice of rate was the subject of some debate, with alternatives including 44,100/1.001 ≈ 44,056 Hz (corresponding to the NTSC color field rate of 60/1.001 ≈ 59.94 Hz) and approximately 44 kHz, proposed by Philips. Ultimately Sony prevailed on both sample rate (44.1 kHz) and bit depth (16 bits per sample, rather than 14 bits per sample).

The sample rate is composed as follows:

NTSC:
245 × 60 × 3 = 44,100
245 active lines/field × 60 fields/second × 3 samples/line = 44,100 samples/second

PAL:
294 × 50 × 3 = 44,100
294 active lines/field × 50 fields/second × 3 samples/line = 44,100 samples/second


In actual practice, different machines used different video cassettes – for example, the Sony PCM-1610 only used 525/60 monochrome video (NTSC, US), not 625/50 (PAL, Europe) or NTSC color.

Alternative rates

Several other sampling rates were also used in early digital audio, most significantly 48 kHz, discussed below under Status.

Earlier rates included a 50 kHz sample rate, used in the 1970s by Soundstream (founded by Thomas Stockham), following a 37 kHz prototype.

In the early 1980s, a 32 kHz sampling rate was used in broadcast (esp. in UK and Japan), because this was sufficient for FM stereo broadcasts, which had 15 kHz bandwidth.

Some digital audio was provided for domestic use in two incompatible EIAJ formats, corresponding to 525/59.94 (44,056 Hz sampling) and 625/50 (44.1 kHz sampling).

Lastly, in what appears to be a coincidence, the 44.1 kHz sampling rate is exactly 4 times the line frequency of the old 441-line German TV standard, which had a line frequency of 441 × 50 ÷ 2 = 11,025 Hz (441 lines per frame, 50 fields per second, 2 fields per frame).

See sampling rate: audio for further rates.

Related rates

Various multiples and submultiples of 44.1 kHz are used: the lower rates 11.025 kHz and 22.05 kHz are found in WAV files and are suitable for low-bandwidth applications, while the higher rates of 88.2 kHz and 176.4 kHz are used in mastering and in DVD-Audio. The higher rates are useful both for the usual reason of providing additional resolution (hence less sensitivity to distortions introduced by editing) and for making the low-pass filtering easier, since a much larger transition band (between the human-audible limit at 20 kHz and half the sampling rate) is possible. The 88.2 kHz and 176.4 kHz rates are primarily used when the ultimate target is a CD.

Consequences

Subsequently, the DAT format was released in 1987 with 48 kHz sampling, and this sample rate, which is a rounder number and also allows a larger transition band in low-pass filtering, has also become common. Converting between these sample rates (sample rate conversion) was initially difficult, due to the relatively large numbers in the ratio between them, 44,100:48,000 = 147:160, but is easy today. This difference was initially exploited to make it difficult to copy 44.1 kHz CDs using 48 kHz DAT equipment.
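
The 147:160 ratio can be derived by reducing the two rates to lowest terms, as in the Python sketch below. The function name `resampling_factors` is hypothetical, and the mention of an "intermediate rate" assumes the textbook rational-resampling scheme (upsample by an integer factor, filter, downsample by another integer factor), not any particular product's method.

```python
from fractions import Fraction


def resampling_factors(src_rate_hz, dst_rate_hz):
    """Return (up, down) such that dst = src * up / down in lowest terms."""
    ratio = Fraction(dst_rate_hz, src_rate_hz)
    return ratio.numerator, ratio.denominator


up, down = resampling_factors(44_100, 48_000)
print(up, down)     # 160 147

# In the upsample-then-downsample scheme assumed here, the common
# intermediate rate is the least common multiple of the two rates.
print(44_100 * up)  # 7056000
```

The large intermediate rate (over 7 MHz) gives a sense of why exact conversion was costly on early hardware.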

Status

Due to the popularity of CDs, a great deal of 44.1 kHz equipment exists, as does a great deal of audio recorded in 44.1 kHz (or multiples thereof). However, some more recent standards use 48 kHz in addition to or instead of 44.1 kHz. In video, 48 kHz is now the standard, but for audio targeted at CDs, 44.1 kHz (and multiples) are still used.

The HDMI TV standard (2003) allows both 44.1 kHz and 48 kHz (and multiples), which provides compatibility with DVD players playing CD, VCD and SVCD content, while the DVD and Blu-ray Disc standards use 48 kHz only.

Most audio processors and sound cards contain DACs for both 44.1 kHz and 48 kHz and can natively output either, though some older processors include only 44.1 kHz output, and some cheaper newer processors include only 48 kHz output, requiring digital sample rate conversion to output other sample rates. Similarly, processors may be able to record natively at only certain sample rates.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 