Latency (engineering)
Latency is a measure of the time delay experienced in a system; its precise definition depends on the system and on the time being measured. The term can therefore have different meanings in different contexts.
Packet-switched networks
Latency in a packet-switched network is measured either one-way (the time from the source sending a packet to the destination receiving it) or round-trip (the one-way latency from source to destination plus the one-way latency from the destination back to the source). Round-trip latency is more often quoted, because it can be measured from a single point. Note that round-trip latency excludes the amount of time that a destination system spends processing the packet. Many software platforms provide a service called ping that can be used to measure round-trip latency. Ping performs no packet processing; it merely sends a response back when it receives a packet (i.e. performs a no-op), so it is a relatively accurate way of measuring latency.
Where precision is important, one-way latency for a link can be more strictly defined as the time from the start of packet transmission to the start of packet reception. The time from the start of packet transmission to the end of packet transmission at the near end is measured separately and is called serialization delay. Serialization delay depends on the throughput of the link and the size of the packet; it is the time required by the system to signal the full packet to the channel.
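Serialization delay as described above can be estimated as packet size divided by link throughput. The following is a minimal sketch; the function name and the example packet size and link rates are illustrative assumptions, not values from the text.

```python
def serialization_delay_s(packet_bytes: int, link_bits_per_s: float) -> float:
    """Time to signal every bit of one packet onto the link, in seconds."""
    if link_bits_per_s <= 0:
        raise ValueError("link rate must be positive")
    return packet_bytes * 8 / link_bits_per_s


# Illustrative (assumed) values: a 1500-byte packet on two different links.
slow = serialization_delay_s(1500, 10e6)  # 10 Mbit/s link
fast = serialization_delay_s(1500, 1e9)   # 1 Gbit/s link
print(f"10 Mbit/s: {slow * 1e6:.0f} microseconds")
print(f"1 Gbit/s:  {fast * 1e6:.0f} microseconds")
```

The same packet takes a hundred times longer to serialize on the slower link, which is why this component is reported separately from the start-to-start latency of the link.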
However, in a non-trivial network, a typical packet will be forwarded over many links via many gateways, each of which will not begin to forward the packet until it has been completely received. In such a network, the minimal latency is the sum of the minimum latency of each link, plus the transmission delay of each link except the final one, plus the forwarding latency of each gateway. In practice, this minimal latency is further augmented by queuing and processing delays. Queuing delay occurs when a gateway receives multiple packets from different sources heading towards the same destination. Since typically only one packet can be transmitted at a time, some of the packets must queue for transmission, incurring additional delay. Processing delays are incurred while a gateway determines what to do with a newly received packet. Bufferbloat, the excess buffering of packets inside the network, can also increase latency by an order of magnitude or more. The combination of propagation, serialization, queuing, and processing delays often produces a complex and variable network latency profile.
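The minimal store-and-forward latency described above can be sketched as a short calculation. This is only an illustration under stated assumptions: the `Link` type, the single forwarding latency shared by all gateways, one gateway between each pair of consecutive links, and the example figures are all hypothetical, and queuing and processing delays are ignored.

```python
from dataclasses import dataclass


@dataclass
class Link:
    min_latency_s: float    # start-of-transmission to start-of-reception
    rate_bits_per_s: float  # link throughput


def minimal_path_latency_s(links, packet_bytes, forwarding_s):
    """Sum link latencies, transmission delay on all links but the last,
    and one forwarding latency per gateway (assumed: len(links) - 1)."""
    if not links:
        raise ValueError("path needs at least one link")
    bits = packet_bytes * 8
    link_latency = sum(link.min_latency_s for link in links)
    transmission = sum(bits / link.rate_bits_per_s for link in links[:-1])
    gateways = len(links) - 1
    return link_latency + transmission + gateways * forwarding_s


# Assumed example: three 100 Mbit/s links of 1 ms each, a 1250-byte packet,
# and 50 microseconds of forwarding latency per gateway.
path = [Link(0.001, 100e6) for _ in range(3)]
total = minimal_path_latency_s(path, packet_bytes=1250, forwarding_s=50e-6)
print(f"minimal latency: {total * 1000:.2f} ms")
```

Real paths add queuing and processing delay on top of this floor, so measured values will generally be higher and more variable.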
Fibre optics
Latency is largely a function of the speed of light, which is 299,792,458 meters/second in vacuum. This equates to a latency of 3.33 microseconds for every kilometer of path length. Because the index of refraction of most fibre optic cables is about 1.5, light travels about 1.5 times as fast in a vacuum as it does in the cable. This works out to about 4.9 microseconds of latency for every kilometer. In shorter metro networks, the latency rises a bit more due to building risers and cross-connects, and can be as high as 5 microseconds per kilometer.
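The per-kilometer figures above follow from dividing distance by the speed of light in the medium. A minimal sketch, with hypothetical function names; note that an index of exactly 1.5 gives about 5.0 microseconds per kilometer, and the 4.9 figure corresponds to an index slightly below that (about 1.47, an assumed value used here only for comparison).

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum


def delay_us_per_km(refractive_index: float = 1.0) -> float:
    """One-way propagation delay for one kilometer, in microseconds."""
    speed_in_medium = C_VACUUM_M_PER_S / refractive_index
    return 1000 / speed_in_medium * 1e6


def fibre_latency_ms(path_km: float, refractive_index: float = 1.5) -> float:
    """One-way propagation delay over a fibre path, in milliseconds."""
    return path_km * delay_us_per_km(refractive_index) / 1000


print(f"vacuum:   {delay_us_per_km(1.0):.2f} us/km")
print(f"n = 1.47: {delay_us_per_km(1.47):.2f} us/km")
print(f"n = 1.50: {delay_us_per_km(1.5):.2f} us/km")
print(f"1000 km of fibre (n = 1.5): {fibre_latency_ms(1000):.2f} ms")
```

The path length passed in should be the routed fibre distance rather than the straight-line distance between endpoints.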
It follows that to calculate latency of a connection, one has to know the distance traveled by the fiber, which is rarely a straight line, since it has to traverse geographic contours and obstacles, such as roads and railway tracks, as well as other rights-of-way.
Due to imperfections in the fiber, light degrades as it is transmitted through it. For distances of greater than 100 kilometers, either amplifiers or regenerators need to be deployed. Accepted wisdom has it that amplifiers add less latency than regenerators, though in both cases it can be highly variable, and so needs to be taken into account. In particular, legacy spans are more likely to make use of higher latency regenerators.
Satellite transmission
Satellite latency is best illustrated when a newsreader in a studio talks to a reporter halfway around the world. The signal travels from the newsreader via a communication satellite in geosynchronous orbit to the reporter, and the reply then goes all the way back to geosynchronous orbit and then to the studio, resulting in a journey of over one hundred thousand kilometers. This time lag is easily noticeable. Even though the signal travels at the speed of light, it still requires about half a second to travel that distance (not including the much smaller latencies inside the communications equipment).
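The half-second figure can be checked with a rough calculation. This sketch assumes a geosynchronous altitude of about 35,786 km and, as a simplification, that both ground stations sit directly beneath the satellite; real slant paths are longer, so this is a lower bound. The names are illustrative.

```python
C_KM_PER_S = 299_792.458       # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786       # assumed altitude of geosynchronous orbit


def question_and_answer_delay_s(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """Delay for studio -> satellite -> reporter -> satellite -> studio.

    Four ground-to-orbit legs, each approximated by the orbital altitude.
    """
    path_km = 4 * altitude_km
    return path_km / C_KM_PER_S


delay = question_and_answer_delay_s()
print(f"path length: {4 * GEO_ALTITUDE_KM:,} km")
print(f"propagation delay: {delay:.3f} s")
```

Passing a much smaller altitude to the same function shows why lower orbits reduce the delay.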
Low-Earth orbit is sometimes used to cut this delay, but at the expense of more complicated satellite tracking on the ground and of requiring more satellites in the satellite constellation to ensure continuous coverage.
Audio latency
Audio latency is the delay between when an audio signal enters and when it emerges from a system. Potential contributors to latency in an audio system include analog-to-digital conversion, buffering, digital signal processing, transmission time, digital-to-analog conversion and the speed of sound in air.
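Two of the contributors listed above, buffering and the speed of sound in air, are easy to estimate. The sketch below is illustrative only: the function names, the 256-frame buffer, the 48 kHz sample rate and the value of roughly 343 m/s for sound in air at about 20 °C are assumptions, and the other contributors are not modelled.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at roughly 20 degrees C


def buffer_latency_ms(frames: int, sample_rate_hz: float) -> float:
    """Delay added by one buffer of `frames` samples, in milliseconds."""
    return frames / sample_rate_hz * 1000


def acoustic_delay_ms(distance_m: float) -> float:
    """Time for sound to travel `distance_m` through air, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000


buffering = buffer_latency_ms(256, 48_000)
acoustic = acoustic_delay_ms(3.43)
print(f"one 256-frame buffer at 48 kHz: {buffering:.2f} ms")
print(f"3.43 m from loudspeaker to listener: {acoustic:.2f} ms")
print(f"combined: {buffering + acoustic:.2f} ms")
```

Systems that buffer on both input and output incur the buffer delay at each stage.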
Operational latency
Any individual workflow within a system of workflows can be subject to some type of operational latency. An individual system may even have more than one type of latency, depending on the type of participant or goal-seeking behavior. This is best illustrated by the following two examples involving air travel.
Consumer view
From the point of view of a passenger, latency can be described as follows. Suppose John Doe flies from London to New York. The latency of his trip is the time it takes him to go from his house in England to the hotel he is staying at in New York. This is independent of the throughput of the London-New York air link: whether there were 100 passengers a day making the trip or 10,000, the latency of the trip would remain the same.
Producer view
From the point of view of flight operations personnel, latency can be entirely different. Consider the staff at the London and New York airports. There are only a limited number of planes able to make the transatlantic journey, so when one lands they must prepare it for the return trip as quickly as possible. It might take, for example:
- 30 minutes to clean a plane
- 15 minutes to refuel a plane
- 10 minutes to load the passengers
- 35 minutes to load the cargo
Assuming the above are done one after another, minimum plane turnaround time is:
- 30 + 15 + 10 + 35 = 90 minutes
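However, cleaning, refueling and loading the cargo can be done at the same time. Passengers can be loaded after cleaning is complete. Reduced latency is:
- Cleaning, then loading passengers: 30 + 10 = 40 minutes
- Refueling: 15 minutes
- Loading cargo: 35 minutes
- Minimum latency = 40 minutes (the longest of the three parallel paths)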
All of the people involved in the turnaround are interested only in the time it takes for their respective task, not the whole. When different tasks are done at the same time, however, it may be possible to reduce the latency to that of the longest task. The more prerequisites every step has, the harder it is to perform the steps in parallel. In the above example, the requirement to clean the plane before loading passengers results in a minimum latency longer than any single task.
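The turnaround example can be expressed as a longest-path calculation over tasks and their prerequisites. This is a minimal sketch using the durations from the example; the task names and the function are illustrative, and it assumes the prerequisite graph has no cycles and that any number of tasks may run at once.

```python
from functools import lru_cache

DURATION_MIN = {
    "clean": 30,
    "refuel": 15,
    "load_passengers": 10,
    "load_cargo": 35,
}
# Passengers can be loaded only after cleaning is complete.
PREREQUISITES = {"load_passengers": ["clean"]}


def turnaround_minutes(durations, prerequisites):
    """Earliest time by which every task can be finished."""

    @lru_cache(maxsize=None)
    def finish(task):
        # A task starts once its slowest prerequisite has finished.
        start = max((finish(p) for p in prerequisites.get(task, [])), default=0)
        return start + durations[task]

    return max(finish(task) for task in durations)


sequential = sum(DURATION_MIN.values())
parallel = turnaround_minutes(DURATION_MIN, PREREQUISITES)
print(f"one after another: {sequential} minutes")
print(f"in parallel:       {parallel} minutes")
```

Adding further prerequisites, for example requiring refueling to finish before passengers board, lengthens the longest chain only if the new dependency lies on it.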
Mechanical latency
Any mechanical process encounters limitations modeled by Newtonian physics. The behavior of disk drives provides an example of mechanical latency. Here, it is the time needed for the data encoded on a platter to rotate from its current position to a position adjacent to the read-write head, as well as the seek time required for the actuator arm to position the read-write head above the appropriate track. These are more specifically known as rotational latency and seek time, since the basic term latency is also applied to the time required by a computer's electronics and software to perform polling, interrupts, and direct memory access.
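Rotational latency can be estimated from the spindle speed. The sketch below assumes the wanted sector is equally likely to be anywhere on the track, so the average wait is half a revolution; the function names, the 7200 rpm figure and the 9 ms seek time are illustrative assumptions rather than values from the text.

```python
def revolution_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60_000 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector to reach the head: half a revolution."""
    return revolution_time_ms(rpm) / 2


def average_access_time_ms(rpm: float, seek_ms: float) -> float:
    """Seek time plus average rotational latency."""
    return seek_ms + average_rotational_latency_ms(rpm)


rotational = average_rotational_latency_ms(7200)
print(f"average rotational latency at 7200 rpm: {rotational:.2f} ms")
print(f"with an assumed 9 ms seek: {average_access_time_ms(7200, 9.0):.2f} ms")
```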
Computer hardware and operating system latency
Computers run a set of instructions called an executable. In operating systems, the execution of the executable can be postponed if other executables (a.k.a. processes) are also executing. In addition, the operating system can schedule when to perform the action that the executable is commanding. For example, suppose a process commands that a computer card's voltage output be set high-low-high-low and so on at a rate of 1000 Hz. The operating system may choose to adjust the scheduling of each transition (high-low or low-high) based on an internal clock. The latency is the delay between the executable instruction commanding the transition and the hardware actually transitioning the voltage from high to low or low to high.
On Microsoft Windows, it appears that the timing of commands to hardware is not exact. Empirical data suggest that Windows (using the Windows sleep timer, which accepts millisecond sleep times) will schedule on a 1024 Hz clock and will delay 24 of 1024 transitions per second to make an average of 1000 Hz for the update rate. This can have serious ramifications for discrete-time algorithms that rely on fairly consistent timing between updates, such as those found in control theory. The sleep function and similar Windows API routines were at no point designed for accurate timing purposes. Certain multimedia-oriented API routines like timeGetTime and its siblings provide better timing consistency. However, consumer- and server-grade Windows (those based on the NT kernel) were not designed to be real-time operating systems. Drastically more accurate timings could be achieved by using dedicated hardware extensions and control-loop cards.
Linux may have the same problems with scheduling of hardware I/O. The problem in Linux is mitigated by the fact that the operating system kernel's process scheduler can be replaced by a real-time scheduler.
On embedded systems, the real-time execution of instructions is expected from the low-level embedded operating system.
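The scheduling delay discussed in this section can be observed from an ordinary user-space program by requesting a short sleep and timing how long it actually takes. The following is a rough sketch, not a benchmark: the function name and sample count are arbitrary, results vary with operating system, load and timer configuration, and it measures only sleep overshoot, not the latency of any hardware output.

```python
import time


def sleep_overshoot_ms(requested_ms: float = 1.0, samples: int = 200):
    """Return, per sample, how much longer a sleep took than requested."""
    overshoot = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_ms / 1000)
        elapsed_ms = (time.perf_counter() - start) * 1000
        overshoot.append(elapsed_ms - requested_ms)
    return overshoot


results = sleep_overshoot_ms()
print(f"mean overshoot:  {sum(results) / len(results):.3f} ms")
print(f"worst overshoot: {max(results):.3f} ms")
```

A wide gap between the mean and the worst case indicates the kind of inconsistent timing between updates that affects discrete-time control loops.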
Latency in simulators and simulation
In simulation applications, 'latency' refers to the time delay, normally measured in milliseconds (1/1,000 sec), between initial input and an output clearly discernible to the simulator trainee or simulator subject. Latency is sometimes also called transport delay.
- Some authorities distinguish between latency and transport delay by using the term 'latency' in the sense of the extra time delay of a system over and above the reaction time of the vehicle being simulated, but this requires a detailed knowledge of the vehicle dynamics and can be controversial.
- Importance of motion and visual latencies. In simulators with both visual and motion systems, it is particularly important that the latency of the motion system not be greater than that of the visual system, or symptoms of simulator sickness may result. This is because in the real world, motion cues are those of acceleration and are quickly transmitted to the brain, typically in less than 50 milliseconds; this is followed some milliseconds later by a perception of change in the visual scene. The visual scene change is essentially one of change of perspective and/or displacement of objects such as the horizon, which takes some time to build up to discernible amounts after the initial acceleration that caused the displacement. A simulator should therefore reflect the real-world situation by ensuring that the motion latency is equal to or less than that of the visual system, and not the other way round.
See also
- Comparison of latency and bandwidth
- Interrupt latency
- Lead time
- Memory latency
- Response time
- Throughput