Echo cancellation
The term echo cancellation is used in telephony to describe the process of removing echo from a voice communication in order to improve voice quality on a telephone call. In addition to improving subjective quality, this process increases the capacity achieved through silence suppression by preventing echo from traveling across a network.
Two sources of echo have primary relevance in telephony: acoustic echo and hybrid echo.
Echo cancellation involves first recognizing the originally transmitted signal that re-appears, with some delay, in the transmitted or received signal. Once the echo is recognized, it can be removed by 'subtracting' it from the transmitted or received signal. This technique is generally implemented using a digital signal processor (DSP), but can also be implemented in software. Echo cancellation is done using either echo suppressors or echo cancellers, or in some cases both.
History
In telephony, "echo" is very much like what one would experience yelling in a canyon: a reflected copy of the talker's voice, heard some time later as a delayed version of the original. On a telephone, if the delay is fairly significant (more than a few hundred milliseconds), it is considered annoying. If the delay is very small (tens of milliseconds or less), the phenomenon is called sidetone and, while not objectionable to humans, can interfere with communication between data modems.
In the earlier days of telecommunications, echo suppression was used to reduce the objectionable nature of echoes to human users. In essence, these devices rely upon the fact that most telephone conversations are half-duplex; that is, one person speaks while the other listens. An echo suppressor attempts to determine which is the primary direction and allows that channel to go forward. In the reverse channel, it places attenuation to block or "suppress" any signal, on the assumption that the signal is echo. Naturally, such a device is not perfect. There are cases where both ends are active, and other cases where one end replies faster than the echo suppressor can switch directions, so that it cannot both keep the echo attenuated and let the remote talker reply without attenuation.
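The switching behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame-energy comparison, the `margin` factor and the 30 dB attenuation are hypothetical choices, not values taken from any real suppressor.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def suppress_echo(far_frames, near_frames, attenuation_db=30.0, margin=2.0):
    """Attenuate the return (near-end) path whenever the far end dominates.

    far_frames / near_frames: lists of equal-length frames (lists of floats).
    attenuation_db and margin are illustrative values (assumptions).
    """
    gain = 10.0 ** (-attenuation_db / 20.0)
    out = []
    for far, near in zip(far_frames, near_frames):
        # The far-end talker is judged "primary" if clearly louder than the near end.
        if frame_energy(far) > margin * frame_energy(near):
            out.append([gain * s for s in near])  # assume the return signal is echo
        else:
            out.append(list(near))                # let the near-end talker through
    return out
```

When both frames carry comparable energy (both parties talking), the comparison fails and the return path passes unattenuated, echo included, which is one of the imperfections the paragraph above mentions.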
Echo cancellers are the replacement for earlier echo suppressors that were initially developed in the 1950s to control echo caused by the long delay on satellite telecommunications circuits. Initial echo canceller theory was developed at AT&T Bell Labs in the 1960s, but the first commercial echo cancellers were not deployed until the late 1970s owing to the limited capability of the electronics of the era. The concept of an echo canceller is to synthesize an estimate of the echo from the talker's signal, and subtract that synthesis from the return path instead of switching attenuation into/out of the path. This technique requires adaptive signal processing to generate a signal accurate enough to effectively cancel the echo, where the echo can differ from the original due to various kinds of degradation along the way.
Rapid advances in the implementation of digital signal processing allowed echo cancellers to be made smaller and more cost-effective. In the 1990s, echo cancellers were implemented within voice switches for the first time (in the Northern Telecom DMS-250) rather than as standalone devices. The integration of echo cancellation directly into the switch meant that echo cancellers could be reliably turned on or off on a call-by-call basis, removing the need for separate trunk groups for voice and data calls. Today's telephony technology often employs echo cancellers in small or handheld communications devices via a software voice engine, which provides cancellation of either acoustic echo or the residual echo introduced by a far-end PSTN gateway system; such systems typically cancel echo reflections with up to 64 milliseconds delay.
Voice messaging and voice response systems that accept speech as caller input use echo cancellation while speech prompts are played, to prevent the system's own speech recognition from falsely recognizing the echoed prompts.
Acoustic echo
Acoustic echo arises when sound from a loudspeaker (for example, the earpiece of a telephone handset) is picked up by the microphone in the same room (for example, the mic in the very same handset). The problem exists in any communications scenario where there is a speaker and a microphone. Examples of acoustic echo are found in everyday surroundings such as:
- Hands-free car phone systems
- A standard telephone or cellphone in speakerphone or hands-free mode
- Dedicated standalone "conference phones"
- Installed room systems which use ceiling speakers and microphones on the table
- Physical coupling (vibrations of the loudspeaker transfer to the microphone via the handset casing)
In most of these cases, direct sound from the loudspeaker (not the person at the far end, otherwise referred to as the Talker) enters the microphone almost unaltered. This is called direct acoustic path echo. The difficulties in cancelling acoustic echo stem from the alteration of the original sound by the ambient space. This colours the sound that re-enters the microphone. These changes can include certain frequencies being absorbed by soft furnishings, and reflection of different frequencies at varying strength. These secondary reflections are not strictly referred to as echo, but rather are "reverb".
Acoustic echo is heard by the far-end talkers in a conversation: if a person in Room A talks, they will hear their voice bounce around in Room B. This sound needs to be cancelled, or it will be sent back to its origin. Because of the round-trip transmission delay, this acoustic echo is very distracting.
Acoustic echo cancellation
Since their invention at AT&T Bell Labs, echo cancellation algorithms have been improved and honed. Like all echo cancelling processes, these first algorithms were designed to anticipate the signal which would inevitably re-enter the transmission path, and cancel it out.
The acoustic echo cancellation (AEC) process works as follows:
- A far-end signal is delivered to the system.
- The far-end signal is reproduced by the speaker in the room.
- A microphone also in the room picks up the resulting direct path sound, and consequent reverberant sound as a near-end signal.
- The far-end signal is filtered and delayed to resemble the near-end signal.
- The filtered far-end signal is subtracted from the near-end signal.
- The resultant signal represents sounds present in the room excluding any direct or reverberated sound produced by the speaker.
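The steps above can be sketched as follows, assuming for the moment that the filter modelling the loudspeaker, room and microphone is already known. The names `fir_filter`, `cancel_echo` and `echo_path_model`, and the sample values, are hypothetical and chosen only for illustration.

```python
def fir_filter(signal, taps):
    """Causal FIR filter: out[n] = sum over k of taps[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

def cancel_echo(far_end, near_end, echo_path_model):
    """Filter and delay the far-end signal, then subtract it from the near-end signal."""
    echo_estimate = fir_filter(far_end, echo_path_model)
    return [d - y for d, y in zip(near_end, echo_estimate)]

# Hypothetical example: two samples of delay, a direct path and one weaker reflection.
room = [0.0, 0.0, 0.6, 0.3]
far_end = [1.0, 0.0, -1.0, 0.5, 0.0, 0.0]
local_speech = [0.0, 0.1, 0.2, 0.0, -0.1, 0.0]

# What the microphone picks up: loudspeaker echo plus the local talker.
near_end = [e + s for e, s in zip(fir_filter(far_end, room), local_speech)]

# With a perfect model, only the local talker remains (up to rounding).
residual = cancel_echo(far_end, near_end, room)
```

In practice the model is never known exactly, which is why the filter has to be estimated and continually adapted.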
The primary challenge for an echo canceller is determining the nature of the filtering to be applied to the far-end signal such that it resembles the resultant near-end signal. The filter is essentially a model of the speaker, microphone and the room's acoustical attributes.
To configure the filter, early echo cancellation systems required training with impulse or pink noise, and some used this as the only model of the acoustic space. Later systems used this training only as a basis to start from, and the canceller then adapted from that point on. By using the far-end signal as the stimulus, modern systems use an adaptive filter and can 'converge' from nothing to 55 dB of cancellation in around 200 ms.
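As a rough illustration of how such an adaptive filter can converge using only the far-end signal as stimulus, here is a minimal sketch based on the normalized least mean squares (NLMS) update. The choice of NLMS, the tap count, the step size and the synthetic echo path are all assumptions made for this example; the text does not say which algorithm a given product uses, and this toy setup says nothing about real convergence times.

```python
import math
import random

def nlms_cancel(far_end, near_end, num_taps=16, step=0.5, eps=1e-8):
    """Return the residual signal after adaptive echo cancellation (NLMS)."""
    weights = [0.0] * num_taps   # the filter starts "from nothing"
    history = [0.0] * num_taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, near_end):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate     # what remains after subtracting the echo estimate
        norm = sum(h * h for h in history) + eps
        scale = step * error / norm
        weights = [w + scale * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual

def power_db(samples):
    """Average power in dB (a tiny offset avoids log of zero)."""
    return 10.0 * math.log10(sum(s * s for s in samples) / len(samples) + 1e-30)

# Hypothetical test signal: white noise through a short synthetic echo path,
# with no near-end talker and no background noise.
random.seed(0)
far = [random.uniform(-1.0, 1.0) for _ in range(3000)]
echo_path = [0.0, 0.5, -0.3, 0.2, 0.1]
near = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far))]

residual = nlms_cancel(far, near)
erle_db = power_db(near[-500:]) - power_db(residual[-500:])
print(f"echo reduced by {erle_db:.0f} dB after adaptation")
```

Real cancellers must also cope with near-end speech during adaptation (double talk), background noise and far longer echo paths, none of which this sketch handles.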
Until recently, echo cancellation only needed to cover the voice bandwidth of telephone circuits. PSTN calls carry frequencies between roughly 300 Hz and 3.4 kHz, the range required for human speech intelligibility. Videoconferencing is one area where full-bandwidth audio is transmitted and received. In this case, specialised products are employed to perform echo cancellation.
Hybrid echo
Hybrid echo is generated by the public switched telephone network (PSTN) through the reflection of electrical energy by a device called a hybrid (hence the term hybrid echo). Most telephone local loops are two-wire circuits while transmission facilities are four-wire circuits. Each hybrid produces echoes in both directions, though the far end echo is usually a greater problem for voiceband.
Retaining echo suppressors
Echo suppression may have the side-effect of removing valid signals from the transmission. This can cause audible signal loss that is called "clipping" in telephony, but the effect is more like a "squelch" than amplitude clipping. In an ideal situation then, echo cancellation alone will be used. However this is insufficient in many applications, notably software phones on networks with long delay and meager throughput. Here, echo cancellation and suppression can work in conjunction to achieve acceptable performance.
Modems
Echo control on voice-frequency data calls that use dial-up modems may cause data corruption. Some telephone devices disable echo suppression or echo cancellation when they detect the 2100 or 2225 Hz "answer" tones associated with such calls, in accordance with ITU-T recommendation G.164 or G.165.
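One common way to detect a single tone such as these in a frame of samples is the Goertzel recurrence. The sketch below is an illustration only: the 8 kHz sample rate, the energy-ratio threshold and the function names are assumptions, and the detection criteria actually required by the ITU-T recommendations (timing and other signal checks) are not modelled here.

```python
import math

def goertzel_power(samples, target_hz, sample_rate=8000):
    """Squared spectral magnitude at target_hz, via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * target_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def is_answer_tone(samples, sample_rate=8000, ratio=0.5):
    """True if most of the frame's energy sits at 2100 Hz or 2225 Hz.

    ratio is an illustrative threshold (assumption), not a standardized value.
    """
    total = sum(x * x for x in samples)
    if total == 0.0:
        return False
    n = len(samples)
    for tone in (2100.0, 2225.0):
        # For a pure tone, goertzel_power is about (n / 2) * total.
        if goertzel_power(samples, tone, sample_rate) > ratio * total * n / 2.0:
            return True
    return False
```

A device using such a check would typically run it on successive frames and disable its echo control only once the tone has persisted for long enough.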
In the 1990s most echo cancellation was done inside modems of type V.32 and later. In voiceband modems this allowed using the same frequencies in both directions simultaneously, greatly increasing the data rate. As part of connection negotiation, each modem sent line probe signals, measured the echoes, and set up its delay lines. Echoes in this case did not include long echoes caused by acoustic coupling, but did include short echoes caused by impedance mismatches in the 2-wire local loop to the telephone exchange.
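The idea of probing the line and measuring the echo can be illustrated with a simple cross-correlation search for the echo's delay and strength. This is a sketch under assumptions: the random two-level probe, the single-echo line model and the function name are hypothetical, and it does not reproduce the training procedure of any particular modem standard.

```python
import random

def estimate_echo_delay(probe, received, max_delay):
    """Return (delay_in_samples, gain) of the strongest echo of probe in received."""
    probe_energy = sum(p * p for p in probe)
    best_delay, best_corr = 0, 0.0
    for delay in range(max_delay + 1):
        corr = 0.0
        for n, p in enumerate(probe):
            if n + delay < len(received):
                corr += p * received[n + delay]
        if abs(corr) > abs(best_corr):
            best_delay, best_corr = delay, corr
    return best_delay, best_corr / probe_energy

# Hypothetical line: a single echo, 37 samples late, at one quarter amplitude.
random.seed(1)
probe = [random.choice((-1.0, 1.0)) for _ in range(200)]
received = [0.0] * 300
for n, p in enumerate(probe):
    received[n + 37] += 0.25 * p

delay, gain = estimate_echo_delay(probe, received, max_delay=80)
```

The measured delay and gain are the kind of quantities a canceller could use to position and initialize its delay line before data transfer begins.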
After the turn of the century, DSL modems also made extensive use of automated echo cancellation. Though they used separate incoming and outgoing frequencies, these frequencies were beyond the voiceband for which the cables were designed, and often suffered attenuation distortion due to bridge taps and incomplete impedance matching. Deep, narrow frequency gaps often resulted, that could not be made usable by echo cancellation. These were detected and mapped out during connection negotiation.
See also
- Signal reflection
- Voice engine
- Audio feedback
- Least mean squares