Audio to video synchronization
Audio to video synchronization (also known as audio video sync, audio/video sync, AV-sync, lip sync, or by the lack of it: lip sync error, lip-flap) refers to the relative timing of audio (sound) and video (image) parts during creation, post-production (mixing), transmission, reception and play-back processing. When sound and video have a timing-related cause and effect, AV-sync can be an issue in television, videoconferencing, or film.
Digital or analog audio video streams or video files usually contain some sort of explicit AV-sync timing, either in the form of interleaved video and audio data or by explicit relative time-stamping of data. The processing of data must respect the relative data timing, e.g. by stretching between or interpolating received data. If the processing does not respect the AV-sync timing, the sync error will increase whenever data is lost, whether because of transmission errors or because of missing or mis-timed processing.
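The time-stamping approach can be illustrated with a minimal sketch. It assumes timestamps counted in ticks of a 90 kHz clock, as MPEG presentation time stamps are, and a sign convention in which a positive error means the audio leads the video. The function names and the measured render times are hypothetical and only serve the illustration.

```python
PTS_CLOCK_HZ = 90_000  # MPEG presentation time stamps count ticks of a 90 kHz clock


def pts_to_ms(pts_ticks: int) -> float:
    """Convert a time stamp in 90 kHz ticks to milliseconds."""
    return pts_ticks * 1000.0 / PTS_CLOCK_HZ


def av_sync_error_ms(video_pts: int, video_shown_ms: float,
                     audio_pts: int, audio_played_ms: float) -> float:
    """Return the AV-sync error in milliseconds.

    Each path's delay is the time it was actually rendered minus the time
    its time stamp asked for. Positive result: audio leads video.
    Negative result: audio lags video. (Sign convention is an assumption.)
    """
    video_delay = video_shown_ms - pts_to_ms(video_pts)
    audio_delay = audio_played_ms - pts_to_ms(audio_pts)
    return video_delay - audio_delay


# A frame and a sound both stamped for t = 1000 ms; the picture appears
# at 1100 ms and the sound at 1020 ms, so the audio leads by 80 ms.
print(av_sync_error_ms(90_000, 1100.0, 90_000, 1020.0))
```

A player that respects the time stamps would delay the faster path (here the audio) until this error is zero.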
Incorrectly synchronized
There are different ways in which the AV-sync can get incorrectly synchronized:
- During creation AV-sync errors happen because of:
  - Internal AV-sync error: Different processing delays between image and sound in video camera and microphone. The AV-sync delay is normally fixed.
  - External AV-sync error: If a microphone is placed far away from the sound source, the audio will be out of sync because the speed of sound is much lower than the speed of light. If the sound source is 340 meters from the microphone, then the sound arrives approximately 1 second later than the light. The AV-sync delay increases with distance.
- During mixing of video clips normally either the audio or video needs to be delayed so they are synchronized. The AV-sync delay is static, but can vary with the individual clip.
- Video editing effects.
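The external AV-sync error from microphone distance is simple arithmetic. The sketch below assumes a speed of sound of 340 m/s (the figure implied by the 340 meter example) and treats the travel time of light as negligible. The function name is hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # approximate value in air, assumed from the example in the text


def acoustic_delay_ms(distance_m: float,
                      speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Return how many milliseconds the sound arrives after the image.

    The travel time of light over the same distance is ignored because it
    is many orders of magnitude smaller.
    """
    if distance_m < 0:
        raise ValueError("distance must not be negative")
    return distance_m / speed_m_per_s * 1000.0


print(acoustic_delay_ms(340.0))  # the 340 m example: about one second
print(acoustic_delay_ms(10.0))   # a microphone 10 m from the source
```

Under this assumption even a microphone placed 10 m away adds roughly 29 ms of audio lag, which is comparable to one video frame.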
Examples of transmission (broadcasting), reception and playback that can cause the AV-sync to become incorrect:
- A video camera with built-in microphones or line-in may not delay the sound and video paths by the same number of milliseconds. A video camera should put some sort of explicit AV-sync timing into the video and audio streams. Solid-state video cameras (e.g. charge-coupled device (CCD) and CMOS image sensors) can delay the video signal by one or more frames.
- An AV stream may get corrupted during transmission because of electrical glitches (wired) or wireless interruptions, which may cause it to become out of sync. The AV-sync delay normally increases with time.
- There is extensive use of audio and video signal processing circuitry with significant delays in television systems. Widely used video signal processing circuitry that contributes significant video delays includes frame synchronizers, digital video effects processors, video noise reduction, format converters and MPEG pre-processing.
- The video monitor processing circuit may delay the video stream. Pixelated displays require video format conversion and deinterlace processing, which can add one or more frames of video delay.
- A video monitor with built-in speakers or line-out may not delay the sound and video paths by the same number of milliseconds. Some video monitors contain internal user-adjustable audio delays to aid in correction of errors.
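Delays expressed in video frames can be converted to milliseconds to choose an audio delay setting. The sketch below is one way to do that arithmetic; the function names are hypothetical, and the frame rates are example values rather than figures from any particular device.

```python
def video_delay_ms(frames: int, frame_rate_hz: float) -> float:
    """Convert a video delay given in whole frames to milliseconds."""
    if frame_rate_hz <= 0:
        raise ValueError("frame rate must be positive")
    return frames * 1000.0 / frame_rate_hz


def compensating_audio_delay_ms(video_frames_delayed: int,
                                frame_rate_hz: float,
                                existing_audio_delay_ms: float = 0.0) -> float:
    """Return the extra audio delay needed to match a delayed video path.

    If the audio path is already later than the video, no added audio
    delay can help, so the result is clamped to zero.
    """
    needed = video_delay_ms(video_frames_delayed, frame_rate_hz) - existing_audio_delay_ms
    return max(0.0, needed)


# Two frames of display processing at 25 frames per second.
print(video_delay_ms(2, 25.0))
# Three frames at 30 frames per second, with 20 ms already in the audio path.
print(compensating_audio_delay_ms(3, 30.0, existing_audio_delay_ms=20.0))
```

An adjustable audio delay of this kind can only correct audio that arrives too early; audio that lags the picture would need the video to be delayed instead.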
Recommendations
For television applications, audio should lead video by no more than 15 milliseconds and audio should lag video by no more than 45 milliseconds. For film, acceptable lip sync is considered to be no more than 22 milliseconds in either direction.
MPEG: Presentation Time Stamp (PTS), Decode Time Stamp (DTS)
Presentation time stamps (PTS) can be embedded in an MPEG transport stream to avoid AV-sync drift. Unfortunately, these time stamps are often added after the video undergoes frame synchronization, format conversion and pre-processing, so those delays remain uncompensated.
Viewer experience of incorrectly synchronized AV-sync
The result typically leaves a filmed or televised character moving his or her mouth when there is no spoken dialog to accompany it, hence the term "lip flap" or "lip-sync error". The resulting audio video sync error can be annoying to the viewer and can even lead to the viewer not enjoying the program, to the program not being effective, and to the speakers being perceived negatively. The loss of effectiveness is of particular concern when product commercials and political candidates are viewed. Television industry standards organizations, such as the Advanced Television Systems Committee, have become involved in setting standards for audio video sync errors.
Because of these annoyances, AV-sync error is of concern to the television programming industry, including television stations, networks, advertisers and program production companies. Unfortunately, the advent of high-definition flat-panel display technologies (LCD, DLP and plasma), which can delay video more than audio, has moved the problem into the viewer's home and beyond the control of the television programming industry alone. Consumer products companies now offer audio delay adjustments in TVs and A/V receivers to compensate for video delay changes, and several companies manufacture dedicated digital audio delays made exclusively for lip-sync error correction.
Effect of no explicit AV-sync timing
When a digital or analog audio video stream does not have some sort of explicit AV-sync timing, these effects will cause the stream to become out of sync:
- In film, these timing errors are most commonly caused by worn films skipping over the movie projector sprockets because the film has torn sprocket holes.
- Errors can also be caused by the projectionist misthreading the film in the projector, although this is rare with competent projectionists.
- Audio to video synchronization is commonly corrected and maintained with an audio synchronizer. Television industry standards organizations have established acceptable amounts of audio and video timing errors and suggested practices related to maintaining acceptable timing.
- A/V sync errors are becoming a significant problem in the digital television industry because of the use of large amounts of video signal processing in television production, television broadcasting and pixelated television displays such as LCD, DLP and plasma displays.
- In the television field, audio video sync problems are commonly caused when significant amounts of video processing are performed on the video part of the television program.
- Typical sources of significant video delays in the television field include video synchronizers and video compression encoders and decoders. Particularly troublesome encoders and decoders are used in MPEG compression systems utilized for broadcasting digital television and storing television programs on consumer and professional recording and playback devices.
- A source of significant video delay is found in pixelated television displays (LCD, plasma display, DLP), which utilize complex video signal processing to convert the resolution of the incoming video signal to the native resolution of the pixelated display, for example converting standard definition video to be displayed on a high definition display. "Lip-flap" may exceed 200 ms at times.
- In broadcast television, it is not unusual for lip-sync error to vary by over 100 ms (several video frames) from time to time.
- The EBU Recommendation R37 “The relative timing of the sound and vision components of a television signal” states that end-to-end audio/video sync should be within +40 ms and -60 ms (audio before / after video, respectively) and that each stage should be within +5 ms and -15 ms.
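The tolerance figures quoted in this article can be checked mechanically against a measured sync error. The following sketch assumes the sign convention used in the EBU statement (positive means audio before video, negative means audio after video). The class and constant names are hypothetical; the limits are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncWindow:
    """Acceptable AV-sync range; both limits are positive millisecond values."""
    max_audio_lead_ms: float  # audio before video
    max_audio_lag_ms: float   # audio after video

    def accepts(self, error_ms: float) -> bool:
        """error_ms > 0 means audio leads video, error_ms < 0 means audio lags."""
        return -self.max_audio_lag_ms <= error_ms <= self.max_audio_lead_ms


# Limits as quoted in the article.
TELEVISION = SyncWindow(max_audio_lead_ms=15.0, max_audio_lag_ms=45.0)
FILM = SyncWindow(max_audio_lead_ms=22.0, max_audio_lag_ms=22.0)
EBU_R37_END_TO_END = SyncWindow(max_audio_lead_ms=40.0, max_audio_lag_ms=60.0)
EBU_R37_PER_STAGE = SyncWindow(max_audio_lead_ms=5.0, max_audio_lag_ms=15.0)

measured_error_ms = -50.0  # example measurement: audio 50 ms after video
print(TELEVISION.accepts(measured_error_ms))
print(EBU_R37_END_TO_END.accepts(measured_error_ms))
```

The asymmetric windows reflect that the quoted limits allow more audio lag than audio lead.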
See also
- Audio synchronizer
- Clapperboard
- Dubbing (filmmaking)
- Input lag
- Lip sync
- MuEv
External links
- Further detailed information on lip sync error and audio synchronizer may be found by searching for these terms at the United States Patent and Trademark Office web site at http://patft.uspto.gov/netahtml/PTO/search-bool.html.