Load balancing (computing)
Load balancing is a computer networking methodology to distribute workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources, to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing, instead of a single component, may increase reliability through redundancy. The load balancing service is usually provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System (DNS) server.

Internet-based services

One of the most common applications of load balancing is to provide a single Internet service from multiple servers, sometimes known as a server farm. Commonly, load-balanced systems include popular web sites, large Internet Relay Chat networks, high-bandwidth File Transfer Protocol sites, Network News Transfer Protocol (NNTP) servers and Domain Name System (DNS) servers. Lately, some load balancers evolved to support databases; these are called database load balancers.

For Internet services, the load balancer is usually a software program that is listening on the port where external clients connect to access services. The load balancer forwards requests to one of the "backend" servers, which usually replies to the load balancer. This allows the load balancer to reply to the client without the client ever knowing about the internal separation of functions. It also prevents clients from contacting backend servers directly, which may have security benefits by hiding the structure of the internal network and preventing attacks on the kernel's network stack or unrelated services running on other ports.

Some load balancers provide a mechanism for handling the case where all backend servers are unavailable. This might include forwarding to a backup load balancer or displaying a message regarding the outage.

An alternate method of load balancing, which does not necessarily require a dedicated software or hardware node, is called round robin DNS. In this technique, multiple IP addresses are associated with a single domain name; clients are expected to choose which server to connect to. Unlike the use of a dedicated load balancer, this technique exposes to clients the existence of multiple backend servers. The technique has other advantages and disadvantages, depending on the degree of control over the DNS server and the granularity of load balancing desired.
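
The rotation behind round robin DNS can be illustrated with a small simulation. This is only a sketch: the class name RoundRobinDNS and the addresses (taken from the 192.0.2.0/24 documentation range) are assumptions made for illustration, and a real DNS server involves caching and resolver behavior that this toy model ignores.

```python
from collections import deque


class RoundRobinDNS:
    """Toy model of a name server that rotates its address list per query."""

    def __init__(self, records):
        # records maps a domain name to the list of addresses registered for it.
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)   # the full list is returned to the client
        addrs.rotate(-1)       # the next query sees a different first address
        return answer


# Hypothetical placeholder addresses, not taken from the article.
dns = RoundRobinDNS({"www.example.org": ["192.0.2.1", "192.0.2.2"]})
first = dns.resolve("www.example.org")
second = dns.resolve("www.example.org")
```

Clients that simply take the first address in each answer end up spread across the servers, which is also why every client can see that several backend servers exist.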

Another, more effective technique for load balancing using DNS is to delegate www.example.org as a sub-domain whose zone is served by each of the same servers that are serving the web site. This technique works particularly well where individual servers are spread geographically on the Internet. For example,

one.example.org A
two.example.org A
www.example.org NS one.example.org
www.example.org NS two.example.org

However, the zone file for www.example.org on each server is different, such that each server resolves its own IP address as the A-record. On server one the zone file for www.example.org reports:

@ in a

On server two the same zone file contains:

@ in a

This way, when a server is down, its DNS will not respond and the web service does not receive any traffic. If the line to one server is congested, the unreliability of DNS ensures less HTTP traffic reaches that server. Furthermore, the quickest DNS response to the resolver is nearly always the one from the network's closest server, ensuring geo-sensitive load-balancing. A short TTL on the A-record helps to ensure traffic is quickly diverted when a server goes down. Consideration must be given to the possibility that this technique may cause individual clients to switch between individual servers in mid-session.

A variety of scheduling algorithms are used by load balancers to determine which backend server to send a request to. Simple algorithms include random choice or round robin. More sophisticated load balancers may take into account additional factors, such as a server's reported load, recent response times, up/down status (determined by a monitoring poll of some kind), number of active connections, geographic location, capabilities, or how much traffic it has recently been assigned. High-performance systems may use multiple layers of load balancing.
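
The simpler scheduling algorithms mentioned above can be sketched in a few lines. The following is a minimal illustration, not the implementation of any particular load balancer; the class and function names (RoundRobin, LeastConnections, random_choice) are hypothetical, and backends are represented by plain strings.

```python
import itertools
import random


class RoundRobin:
    """Hand out backends in a fixed, repeating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend that currently has the fewest active connections."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1   # the caller now holds one connection
        return backend

    def release(self, backend):
        self.active[backend] -= 1   # call when the connection is closed


def random_choice(backends, rng=random):
    """Pick any backend with equal probability."""
    return rng.choice(list(backends))
```

Round robin and random choice need no feedback from the servers, whereas the least-connections variant depends on the balancer tracking when connections open and close.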

In addition to using dedicated hardware load balancers, software-only solutions are available, including open source options. Examples of the latter include the Apache web server's mod_proxy_balancer extension, Varnish, or the Pound reverse proxy and load balancer. Gearman can be used to distribute appropriate computer tasks to multiple computers, so large tasks can be done more quickly.

In a multitier architecture, terminology for designs behind a load balancer or network dispatcher may include Bowties and Stovepipes. A stovepipe presents a situation such that a transaction that is dispatched at a top tier follows a static path through the stack of devices and software behind the load balancer to its final destination. Alternatively, if Bowties are used, at each tier the transaction could take one of many paths after being serviced by the applications at a particular tier. Network diagrams with transaction flows resemble Stovepipes or Bowties, or hybrid architectures based on need at each tier.


An important issue when operating a load-balanced service is how to handle information that must be kept across the multiple requests in a user's session. If this information is stored locally on one backend server, then subsequent requests going to different backend servers would not be able to find it. This might be cached information that can be recomputed, in which case load-balancing a request to a different backend server just introduces a performance issue.

One solution to the session data issue is to send all requests in a user session consistently to the same backend server. This is known as persistence or stickiness. A significant downside to this technique is its lack of automatic failover: if a backend server goes down, its per-session information becomes inaccessible, and any sessions depending on it are lost. The same problem is usually relevant to central database servers; even if web servers are "stateless" and not "sticky", the central database is (see below).

Assignment to a particular server might be based on a username, client IP address, or by random assignment. Because the client's perceived address can change as a result of DHCP, network address translation, and web proxies, assignment by IP address may be unreliable. Random assignments must be remembered by the load balancer, which creates a burden on storage. If the load balancer is replaced or fails, this information may be lost, and assignments may need to be deleted after a timeout period or during periods of high load to avoid exceeding the space available for the assignment table. The random assignment method also requires that clients maintain some state, which can be a problem, for example when a web browser has disabled storage of cookies. Sophisticated load balancers use multiple persistence techniques to avoid some of the shortcomings of any one method.
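
Two of the assignment methods described above can be sketched as follows. This is an illustrative sketch under stated assumptions: sticky_backend and AssignmentTable are hypothetical names, the hash-based mapping is one possible way to derive a server from a username or client address, and the injectable clock exists only to make the timeout behavior easy to demonstrate.

```python
import hashlib
import random
import time


def sticky_backend(client_key, backends):
    """Map a username or client IP address to the same backend every time."""
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


class AssignmentTable:
    """Remember random assignments and drop them after a timeout."""

    def __init__(self, backends, ttl_seconds, clock=time.monotonic):
        self._backends = list(backends)
        self._ttl = ttl_seconds
        self._clock = clock
        self._table = {}   # session id -> (backend, last-seen time)

    def backend_for(self, session_id):
        now = self._clock()
        entry = self._table.get(session_id)
        if entry is not None and now - entry[1] < self._ttl:
            backend = entry[0]                       # keep the existing assignment
        else:
            backend = random.choice(self._backends)  # new or expired session
        self._table[session_id] = (backend, now)
        return backend

    def expire(self):
        """Delete stale entries so the table does not grow without bound."""
        now = self._clock()
        stale = [sid for sid, (_, seen) in self._table.items()
                 if now - seen >= self._ttl]
        for sid in stale:
            del self._table[sid]
        return len(stale)
```

The hash-based function needs no storage but inherits the unreliability of the key it hashes, while the table illustrates the storage burden and timeout handling that random assignment brings with it.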

Another solution is to keep the per-session data in a database. Generally this is bad for performance since it increases the load on the database: the database is best used to store information less transient than per-session data. To prevent a database from becoming a single point of failure, and to improve scalability, the database is often replicated across multiple machines, and load balancing is used to spread the query load across those replicas. Microsoft's ASP.NET State Server technology is an example of a session database. All servers in a web farm store their session data on State Server and any server in the farm can retrieve the data.

Fortunately there are more efficient approaches. In the very common case where the client is a web browser, per-session data can be stored in the browser itself. One technique is to use a browser cookie, suitably time-stamped and encrypted. Another is URL rewriting. Storing session data on the client is generally the preferred solution: then the load balancer is free to pick any backend server to handle a request. However, this method of state-data handling is not really suitable for some complex business logic scenarios, where the session state payload is very big or recomputing it with every request on a server is not feasible. URL rewriting also has major security issues, since the end user can easily alter the submitted URL and thus change session streams. Encrypted client-side cookies are arguably just as insecure, since unless all transmission is over HTTPS, they are very easy to copy or decrypt in man-in-the-middle attacks.
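
As a rough illustration of a time-stamped cookie that any backend server can validate, the sketch below signs the session payload with an HMAC. Note the substitution: this detects tampering and expiry but does not encrypt the payload, so it is a simpler stand-in for the encrypted cookie described above. The names make_cookie, read_cookie and SECRET_KEY are hypothetical, and the key is assumed to be shared by all backend servers.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical shared secret; every backend server would need the same key.
SECRET_KEY = b"replace-with-a-shared-secret"


def make_cookie(payload, now=None, key=SECRET_KEY):
    """Return 'base64(timestamp:payload).signature' for the given payload."""
    issued = int(time.time() if now is None else now)
    body = f"{issued}:{payload}".encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(body).decode("ascii") + "." + tag


def read_cookie(cookie, max_age, now=None, key=SECRET_KEY):
    """Return the payload, or None if the cookie is malformed, altered or too old."""
    try:
        encoded, tag = cookie.rsplit(".", 1)
        body = base64.urlsafe_b64decode(encoded.encode("ascii"))
    except ValueError:
        return None
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return None
    issued_text, _, payload = body.decode("utf-8").partition(":")
    current = time.time() if now is None else now
    if current - int(issued_text) > max_age:
        return None
    return payload
```

Because validation needs only the shared key, the load balancer remains free to send each request to any backend. A signed cookie can still be copied in transit, which is the concern the text raises about transmission without HTTPS.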


Load balancer features

Hardware and software load balancers may have a variety of special features.
  • Asymmetric load: A ratio can be manually assigned to cause some backend servers to get a greater share of the workload than others. This is sometimes used as a crude way to account for some servers having more capacity than others and may not always work as desired.
  • Priority activation: When the number of available servers drops below a certain number, or load gets too high, standby servers can be brought online.
  • SSL offload and acceleration: Depending on the workload, processing the encryption and authentication requirements of an SSL request can become a major part of the demand on the web server's CPU, and as the demand increases the users will see slower response times. To remove this demand from the web server, a load balancer may be used to terminate the SSL connections. Some load balancer appliances include specialized hardware to process SSL. When a load balancer terminates the SSL connections, the requests are converted from HTTPS to HTTP in the load balancer before being passed to the web server. So long as the load balancer itself is not overloaded, this feature will not noticeably degrade the performance perceived by the end users. The downside of this approach is that all of the SSL processing is concentrated on a single device (the load balancer), which can become a new bottleneck. When this feature is not used, the SSL overhead is distributed among the web servers. For these reasons it is important to compare the total cost of a load balancer appliance, which is often quite high, to that of the machines hosting the web servers, which are often inexpensive commodity servers, before deciding to use this feature. Adding a few web servers may be significantly cheaper than upgrading a load balancer. Also, some server vendors such as Oracle/Sun now incorporate cryptographic acceleration hardware into some models such as the T2000, which reduces the CPU burden and response time of SSL requests. One clear benefit of SSL offloading in the load balancer is that it enables the load balancer to do load balancing or content switching based on data in the HTTPS request.
  • Distributed Denial of Service (DDoS) attack protection: load balancers can provide features such as SYN cookies and delayed-binding (the back-end servers don't see the client until it finishes its TCP handshake) to mitigate SYN flood attacks and generally offload work from the servers to a more efficient platform.
  • HTTP compression: reduces the amount of data to be transferred for HTTP objects by utilizing gzip compression available in all modern web browsers. The larger the response and the further away the client is, the more this feature will improve response times. The tradeoff is that this feature puts additional CPU demand on the load balancer, and it could be done by web servers instead.
  • TCP offload: different vendors use different terms for this, but the idea is that normally each HTTP request from each client is a different TCP connection. This feature utilizes HTTP/1.1 to consolidate multiple HTTP requests from multiple clients into a single TCP socket to the back-end servers.
  • TCP buffering: the load balancer can buffer responses from the server and spoon-feed the data out to slow clients, allowing the server to move on to other tasks.
  • Direct Server Return: an option for asymmetrical load distribution, where request and reply have different network paths.
  • Health checking: the balancer will poll servers for application layer health and remove failed servers from the pool.
  • HTTP caching: the load balancer can store static content so that some requests can be handled without contacting the web servers.
  • Content filtering: some load balancers can arbitrarily modify traffic on the way through.
  • HTTP security: some load balancers can hide HTTP error pages, remove server identification headers from HTTP responses, and encrypt cookies so end users can't manipulate them.
  • Priority queuing: also known as rate shaping, the ability to give different priority to different traffic.
  • Content-aware switching: most load balancers can send requests to different servers based on the URL being requested.
  • Client authentication: authenticate users against a variety of authentication sources before allowing them access to a website.
  • Programmatic traffic manipulation: at least one load balancer allows the use of a scripting language to allow custom load balancing methods, arbitrary traffic manipulations, and more.
  • Firewall: direct connections to backend servers are prevented, for network security reasons.
  • Intrusion prevention system: offers application layer security in addition to the network/transport layer security offered by a firewall.
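
Two of the features listed above, asymmetric load and health checking, combine naturally and can be sketched together. This is a minimal illustration under stated assumptions: the function name pick_weighted, the weights mapping and the healthy set are hypothetical, and the health information is assumed to come from some separate polling mechanism.

```python
import random


def pick_weighted(weights, healthy, rng=random):
    """Choose a backend in proportion to its weight, skipping unhealthy ones.

    weights maps each backend to its manually assigned ratio; healthy is the
    set of backends that passed the most recent health check.
    """
    candidates = [b for b in weights if b in healthy and weights[b] > 0]
    if not candidates:
        return None   # nothing left in the pool
    ratios = [weights[b] for b in candidates]
    return rng.choices(candidates, weights=ratios, k=1)[0]
```

With weights of 3 and 1, the first server receives roughly three quarters of the requests over time; a server removed from the healthy set receives none until a later health check restores it.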

Use in telecommunications

Load balancing can be useful in applications with redundant communications links. For example, a company may have multiple Internet connections ensuring network access if one of the connections fails.

A failover arrangement would mean that one link is designated for normal use, while the second link is used only if the primary link fails.

Using load balancing, both links can be in use all the time. A device or program monitors the availability of all links and selects the path for sending packets. Use of multiple links simultaneously increases the available bandwidth.

Many telecommunications companies have multiple routes through their networks or to external networks. They use sophisticated load balancing to shift traffic from one path to another to avoid network congestion on any particular link, and sometimes to minimize the cost of transit across external networks or improve network reliability.

Another way of using load balancing is in network monitoring activities. Load balancers can be used to split huge data flows into several subflows and use several network analyzers, each reading a part of the original data. This is very useful for monitoring fast networks like 10GbE or STM64, where complex processing of the data may not be possible at wire speed.

Relationship to failover

Load balancing is often used to implement failover: the continuation of a service after the failure of one or more of its components. The components are monitored continually (e.g., web servers may be monitored by fetching known pages), and when one becomes non-responsive, the load balancer is informed and no longer sends traffic to it. When a component comes back online, the load balancer begins to route traffic to it again. For this to work, there must be at least one component in excess of the service's capacity. This is much less expensive and more flexible than failover approaches where a single live component is paired with a single backup component that takes over in the event of a failure. Some types of RAID systems can also utilize a hot spare for a similar effect.
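
The monitoring loop described above can be sketched as a pool that routes only to components that passed their last probe. This is an illustrative sketch: FailoverPool and its probe callback are hypothetical, and in practice the probe might fetch a known page from each web server.

```python
class FailoverPool:
    """Route requests round robin across components that passed their last probe."""

    def __init__(self, components, probe):
        self._components = list(components)
        self._probe = probe                  # callable: component -> True if responsive
        self._live = set(self._components)
        self._next = 0

    def check(self):
        """Probe every component and update the set of live ones."""
        for component in self._components:
            if self._probe(component):
                self._live.add(component)      # back online: resume routing to it
            else:
                self._live.discard(component)  # non-responsive: stop routing to it

    def route(self):
        """Return the next live component, or None if every component is down."""
        for _ in range(len(self._components)):
            component = self._components[self._next % len(self._components)]
            self._next += 1
            if component in self._live:
                return component
        return None
```

A caller would run check() periodically and route() for every request; the remaining components absorb the traffic of a failed one, which is why spare capacity is required.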


  • A10 Networks
  • Array Networks
  • Avaya
  • Barracuda Networks
  • Brocade Communications Systems
  • CAI Networks
  • Cisco Systems
  • Citrix Systems
  • Coyote Point Systems
  • Crescendo Networks
  • Double-Take (NSI Product)
  • F5 Networks
  • Foundry Networks
  • Inlab Software GmbH
  • LoadBalancer.org
  • Nortel Networks
  • PineApp
  • PIOLINK
  • Radware
  • Resonate
  • Stonesoft
  • Strangeloop Networks
  • Zeus Technology

See also

  • Application Delivery Controller
  • Cloud computing
  • Edge computing
  • Common Address Redundancy Protocol
  • Network Load Balancing Services
  • Processor affinity
  • Affinity mask

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.