Low latency (capital markets) - AbsoluteAstronomy.com

Low latency is a topic within capital markets, where the proliferation of algorithmic trading requires firms to react to market events faster than the competition to increase the profitability of trades. For example, when executing arbitrage strategies, the opportunity to “arb” the market may present itself for only a few milliseconds before parity is achieved. To illustrate the value that clients put on latency, a large global investment bank has stated that every millisecond lost results in $100m per annum in lost opportunity.

What is considered “low” is therefore relative, but also a self-fulfilling prophecy. Many organisations use the words “ultra low latency” to describe latencies of under 1 millisecond, but what is considered low today will no doubt be considered unacceptable in a few years' time.

Ultimately the speed of light, "c", dictates the limit: a trading engine just 150 km (93 miles) down the road from the exchange can never achieve better than 1 ms return times to the exchange. This also assumes a vacuum; in practice there are several routers, switches and cables between the two, and even if there were a dedicated cable, messages still travel more slowly down a cable than in a vacuum. As a result most trading engines can be found physically close to the exchanges; some firms go as far as putting their engines in the same building to further reduce latency.
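
The 1 ms figure can be checked with a short calculation. The sketch below assumes a perfectly straight path through a vacuum, which no real network offers, so it gives only a lower bound; the function name is illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a straight vacuum path, in milliseconds."""
    one_way_s = (distance_km * 1_000) / SPEED_OF_LIGHT_M_PER_S
    return 2 * one_way_s * 1_000


if __name__ == "__main__":
    # A trading engine 150 km from the exchange: just over 1 ms there and back.
    print(f"{min_round_trip_ms(150):.4f} ms")
```

Any real route adds switching, routing and the slower speed of light in cable on top of this bound.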

A crucial factor in determining the latency of a data channel is its throughput. Data rates are increasing exponentially, which has a direct relation to the speed at which messages can be processed. Low-latency systems need not only to get a message from A to B as quickly as possible, but also to do this for millions of messages per second. See comparison of latency and throughput for a more in-depth discussion.

Latency from event to execution

When talking about latency in the context of capital markets, consider the round trip between event and trade:

Event occurs at a particular venue
Information about that event is placed in a message on the wire
Message reaches the decision making application
Application makes a trade decision based upon that event
Order sent to the trading venue
Venue executes the order
Order confirmation sent back to application
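
One way to reason about this round trip is to timestamp each step and look at the elapsed time between consecutive steps. The sketch below is illustrative only: the stage names mirror the list above, and the timestamps are hypothetical values in microseconds, not measurements.

```python
def stage_latencies(events):
    """Return ([(stage, elapsed_us), ...], total_us) from ordered (stage, timestamp_us) pairs."""
    deltas = [
        (stage, t_now - t_prev)
        for (_, t_prev), (stage, t_now) in zip(events, events[1:])
    ]
    total = events[-1][1] - events[0][1]
    return deltas, total


# Hypothetical timestamps (microseconds since the event occurred).
events = [
    ("event occurs at venue", 0),
    ("message on the wire", 40),
    ("message reaches application", 540),
    ("trade decision made", 600),
    ("order sent to venue", 620),
    ("venue executes order", 1150),
    ("confirmation received", 1650),
]

deltas, total = stage_latencies(events)
for stage, elapsed in deltas:
    print(f"{stage:30s} {elapsed:6d} us")
print(f"{'round trip':30s} {total:6d} us")
```

Breaking the total down this way shows which hop dominates, which is where optimisation effort is best spent.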

We also need to consider how latency is assembled in this chain of events:

Processing, the time taken to process a message (which could be as simple as a network switch routing a packet)
Propagation, the time taken for a bit of data to get from A to B (limited by the speed of light)
Packet size divided by bandwidth, which depends on total message size (payload + headers), available bandwidth and the number of messages being sent across the link.
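
The three components above can be combined into a rough per-link estimate. This is a minimal sketch under stated assumptions: the function names and the example figures (a 1,500-byte message, a 1 Gbit/s link, 5 microseconds of processing, 4.9 microseconds per kilometre of fiber) are illustrative, and queuing behind other messages is ignored.

```python
def transmission_delay_us(message_bytes: int, bandwidth_bits_per_s: float) -> float:
    """Time to clock a message onto the link: packet size divided by bandwidth."""
    return message_bytes * 8 / bandwidth_bits_per_s * 1e6


def link_latency_us(
    processing_us: float,
    distance_km: float,
    us_per_km: float,
    message_bytes: int,
    bandwidth_bits_per_s: float,
) -> float:
    """Sum of processing, propagation and packet-size/bandwidth delay for one link."""
    propagation_us = distance_km * us_per_km
    return (
        processing_us
        + propagation_us
        + transmission_delay_us(message_bytes, bandwidth_bits_per_s)
    )


if __name__ == "__main__":
    # Assumed example: 1,500-byte message, 1 Gbit/s link, 10 km of fiber.
    print(f"{transmission_delay_us(1_500, 1e9):.1f} us on the wire")
    print(f"{link_latency_us(5.0, 10, 4.9, 1_500, 1e9):.1f} us total")
```

With these numbers propagation accounts for most of the total, which is why distance receives so much attention.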

A series of steps contributes to the total latency of a trade:

Event occurrence to being on the wire

The systems at a particular venue need to handle events, such as order placement, and get them onto the wire as quickly as possible to be competitive within the market place. Some venues offer premium services for clients needing the quickest solutions.

Exchange to Application

This is one of the areas where most delay can be added, due to the distances involved, amount of processing by internal routing engines, hand off between different networks and the sheer amount of data which is being sent, received and processed from various data venues.

Latency is largely a function of the speed of light, which is 299,792,458 metres/second in a vacuum; this equates to a latency of about 3.3 microseconds for every kilometre. When measuring the latency of data, however, we need to account for the fiber optic cable: though it seems pure, it is not a vacuum, so the refraction of light needs to be accounted for. For long-haul networks the figure is about 4.9 microseconds for every kilometre. In shorter metro networks the latency rises a little more because of building risers and cross-connects, and can be as high as 5 microseconds per kilometre.
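
The per-kilometre figure for fiber follows from the speed of light and the refractive index of the glass. The sketch below assumes an index of about 1.468, a typical value for silica single-mode fiber; the exact index depends on the fiber and wavelength, so treat it as an assumption rather than a property of any particular network.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
ASSUMED_FIBER_INDEX = 1.468  # assumption: typical silica single-mode fiber


def latency_us_per_km(refractive_index: float = 1.0) -> float:
    """One-way propagation delay per kilometre, in microseconds."""
    speed_in_medium = SPEED_OF_LIGHT_M_PER_S / refractive_index
    return 1_000 / speed_in_medium * 1e6


def route_latency_us(route_km: float, us_per_km: float) -> float:
    """One-way propagation delay over the actual cable route, not the straight line."""
    return route_km * us_per_km


if __name__ == "__main__":
    print(f"vacuum: {latency_us_per_km():.2f} us/km")
    print(f"fiber:  {latency_us_per_km(ASSUMED_FIBER_INDEX):.2f} us/km")
    # Hypothetical 120 km cable route between two sites.
    print(f"120 km of fiber: {route_latency_us(120, 4.9):.0f} us one way")
```

The route length passed in should be the length of the cable itself, which is usually longer than the geographic distance between the two sites.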

It follows that to calculate latency of a connection, one has to know the distance travelled by the fiber, which is rarely a straight line, since it has to traverse geographic contours and obstacles, such as roads and railway tracks, as well as other rights-of-way.

Due to imperfections in the fiber, light degrades as it is transmitted through it. For distances of greater than 100 kilometres, either amplifiers or regenerators need to be deployed. Accepted wisdom has it that amplifiers add less latency than regenerators, though in both cases it can be highly variable, and so needs to be taken into account. In particular, legacy spans are more likely to make use of higher latency regenerators.

Propagation between the location of the execution venue and the location of the application
Delays in data aggregation networks such as Reuters IDN, Bloomberg, IDC and others
Propagation within internal networks
Processing within internal networks
Processing by internal routing systems
Bandwidth of extranet and internal networks
Message packet sizes
Amount of data being sent and received

Application decision making

This area isn’t really put under the umbrella of “low-latency”, rather it is the ability of the trading firm to take advantage of High Performance Computing technologies to process data quickly. However, it is included for completeness.

Processing by APIs
Processing by Applications
Propagation between internal systems
Network processing/bandwidth/packet size/propagation between internal systems

Sending the order to the venue

As with the delays between exchange and application, many trades will involve a brokerage firm, and the competitiveness of the brokerage firm in many cases is directly related to the performance of its order placement and management systems.

Processing by internal order management systems
Processing by Broker systems
Propagation between Application and Broker
Propagation between Broker and Execution Venue

Order execution

The amount of time it takes for the execution venue to process and match the order

Average Latency

Average latency is the mean time for a message to be passed from one point to another; the lower the better. Times under 1 millisecond are typical for a market data system.

Latency Jitter

There are many use cases where predictability of latency in message delivery is as important as, if not more important than, a low average latency. This latency predictability is also referred to as low latency jitter and describes a narrow deviation of latencies around the mean latency measurement.
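
The difference between average latency and jitter is easy to see with two sets of samples that share the same mean. The sketch below uses the standard deviation as the measure of jitter, which is one common choice among several; the sample values are made up for illustration.

```python
import statistics


def latency_summary(samples_us):
    """Return (mean latency, jitter) where jitter is the population standard deviation."""
    mean = statistics.fmean(samples_us)
    jitter = statistics.pstdev(samples_us)
    return mean, jitter


# Made-up latency samples in microseconds: same mean, very different spread.
steady = [100, 101, 99, 100]
bursty = [50, 150, 60, 140]

steady_mean, steady_jitter = latency_summary(steady)
bursty_mean, bursty_jitter = latency_summary(bursty)

print(f"steady: mean={steady_mean:.1f} us, jitter={steady_jitter:.1f} us")
print(f"bursty: mean={bursty_mean:.1f} us, jitter={bursty_jitter:.1f} us")
```

Both series report the same average, yet only the first offers predictable delivery times.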

Throughput

Throughput can be defined as the amount of data processed per unit of time. It refers to the number of messages being received, sent and processed by the system and is usually measured in updates per second. Throughput is correlated with latency measurements: typically, as the message rate increases, so do the latency figures. To give an indication of the number of messages involved, the Options Price Reporting Authority (OPRA) predicted peak message rates of 907,000 updates per second (ups) on its network by July 2008. This is just a single venue; most firms will be taking updates from several venues.
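
To put that message rate in perspective, it helps to convert it into the time available per message. The sketch below assumes messages are handled one at a time on a single processing path, which real systems avoid by working in parallel; it is meant only to show the scale.

```python
def per_message_budget_us(updates_per_second: float) -> float:
    """Average time available per message, in microseconds, if handled one at a time."""
    return 1e6 / updates_per_second


if __name__ == "__main__":
    # The OPRA peak rate quoted above leaves roughly 1.1 microseconds per update.
    print(f"{per_message_budget_us(907_000):.2f} us per update")
```

A system that takes longer than this per message must either process in parallel or queue, and queuing is what pushes latency up as the message rate rises.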

Timestamping/Clocks

Clock accuracy is paramount when testing the latency between systems. Any discrepancies will give inaccurate results. Many tests involve locating the publishing node and the receiving node on the same machine to ensure the same clock time is being used. This isn’t always possible however, so clocks on different machines need to be kept in sync using some sort of time protocol:

NTP is limited to milliseconds, so is not accurate enough for today’s low-latency applications.
CDMA time accuracy is in tens of microseconds. It is US based only. Accuracy is affected by the distance from the transmission source.
GPS is the most accurate time protocol in terms of synchronisation. It is, however, the most expensive.
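
Whichever time source is used, the underlying problem is estimating the offset between two clocks. The sketch below shows the four-timestamp calculation used by NTP-style protocols. It assumes the network delay is the same in both directions, which is exactly the assumption that limits accuracy in practice; the example timestamps are hypothetical.

```python
def clock_offset_and_delay(t0, t1, t2, t3):
    """Estimate the server-minus-client clock offset and the round-trip network delay.

    t0: request sent      (client clock)
    t1: request received  (server clock)
    t2: reply sent        (server clock)
    t3: reply received    (client clock)
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


if __name__ == "__main__":
    # Hypothetical case in microseconds: server clock 500 us ahead,
    # 100 us each way on the network, 20 us spent inside the server.
    offset, delay = clock_offset_and_delay(t0=0, t1=600, t2=620, t3=220)
    print(f"offset={offset} us, network delay={delay} us")
```

If the outbound and return paths differ in delay, half of that difference appears as an error in the offset, and therefore in any one-way latency measured between the two machines.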

Reducing Latency in the Order Chain

Reducing latency in the order chain involves attacking the problem from many angles. Amdahl's Law, commonly used to calculate the performance gains of throwing more CPUs at a problem, can be applied more generally to improving latency: improving a portion of a system which is already fairly inconsequential (with respect to latency) will result in minimal improvement in the overall performance.
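
Amdahl's Law applied to latency can be written as a one-line function. In the sketch below, `fraction_improved` is the share of total latency spent in the component being optimised and `component_speedup` is how much faster that component becomes; both names and the example figures are illustrative.

```python
def overall_speedup(fraction_improved: float, component_speedup: float) -> float:
    """Amdahl's Law: overall speedup when only part of the system is improved."""
    return 1 / ((1 - fraction_improved) + fraction_improved / component_speedup)


if __name__ == "__main__":
    # Making a component that accounts for 5% of latency ten times faster...
    print(f"{overall_speedup(0.05, 10):.3f}x overall")
    # ...helps far less than merely doubling the speed of one that accounts for 60%.
    print(f"{overall_speedup(0.60, 2):.3f}x overall")
```

The comparison shows why it pays to measure each stage of the order chain before deciding where to invest.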
