Link aggregation
Encyclopedia
Link aggregation or trunking or link bundling or Ethernet/network/NIC bonding or NIC teaming are computer networking umbrella terms to describe various methods of combining (aggregating) multiple network connections in parallel to increase throughput
Throughput
In communication networks, such as Ethernet or packet radio, throughput or network throughput is the average rate of successful message delivery over a communication channel. This data may be delivered over a physical or logical link, or pass through a certain network node...

 beyond what a single connection could sustain, and to provide redundancy
Redundancy (engineering)
In engineering, redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability of the system, usually in the case of a backup or fail-safe....

 in case one of the links fails.

These umbrella terms not only encompass vendor-independent standards such as IEEE 802.3ad Link Aggregation Control Protocol (LACP) for wired Ethernet, and the physical medium independent IEEE 802.1ax, but also various proprietary solutions.

Aggregation can be implemented at any of the lowest three layers of the OSI model
OSI model
The Open Systems Interconnection model is a product of the Open Systems Interconnection effort at the International Organization for Standardization. It is a prescription of characterizing and standardizing the functions of a communications system in terms of abstraction layers. Similar...

. Commonplace examples of aggregation at layer 1 are power line
Power line communication
Power line communication or power line carrier , also known as power line digital subscriber line , mains communication, power line telecom , power line networking , or broadband over power lines are systems for carrying data on a conductor also used for electric power transmission.A wide range...

 (e.g. IEEE 1901) and wireless
Wireless LAN
A wireless local area network links two or more devices using some wireless distribution method , and usually providing a connection through an access point to the wider internet. This gives users the mobility to move around within a local coverage area and still be connected to the network...

 (e.g. IEEE 802.11) network devices that combine multiple frequency bands into a single wider one. Layer 2 (data link layer
Data link layer
The data link layer is layer 2 of the seven-layer OSI model of computer networking. It corresponds to, or is part of the link layer of the TCP/IP reference model....

, e.g. Ethernet frame
Ethernet frame
A data packet on an Ethernet link is called an Ethernet frame. A frame begins with Preamble and Start Frame Delimiter. Following which, each Ethernet frame continues with an Ethernet header featuring destination and source MAC addresses. The middle section of the frame is payload data including any...

 in LANs or multi-link PPP in WANs) aggregation typically occurs across switch ports, which can be either physical ports, or virtual ones managed by an operating system, e.g. such as those of Open vSwitch. Aggregation is also possible at layer 3 in the OSI model, i.e. at the network layer (e.g. IP or IPX), using round-robin scheduling, or based on hash values computed from fields in the packet header, or a combination of these two methods. Regardless of the layer on which aggregation occurs, the network load is balanced across all links. Most methods provide failover/redundancy as well.

Combining can either occur such that multiple interfaces share one logical address (i.e. MAC or IP), or it can be done such that each interface has its own address. The former requires that both ends of a link use the same aggregation method, but has performance advantages over the latter.

Description

Link aggregation addresses two problems with Ethernet
Ethernet
Ethernet is a family of computer networking technologies for local area networks commercially introduced in 1980. Standardized in IEEE 802.3, Ethernet has largely replaced competing wired LAN technologies....

 connections: bandwidth limitations and lack of resilience.

With regard to the first issue: bandwidth requirements do not scale linearly. Ethernet bandwidths historically have increased by an order of magnitude
Order of magnitude
An order of magnitude is the class of scale or magnitude of any amount, where each class contains values of a fixed ratio to the class preceding it. In its most common usage, the amount being scaled is 10 and the scale is the exponent being applied to this amount...

 each generation: 10 Megabit
Megabit
The megabit is a multiple of the unit bit for digital information or computer storage. The prefix mega is defined in the International System of Units as a multiplier of 106 , and therefore...

/s, 100 Mbit/s, 1000 Mbit/s, 10,000 Mbit/s. If one started to bump into bandwidth ceilings, then the only option was to move to the next generation which could be cost prohibitive. An alternative solution, introduced by many of the network manufacturers in the early 1990s, is to combine two physical Ethernet links into one logical link via channel bonding
Channel bonding
Channel bonding is a computer networking arrangement in which two or more network interfaces on a host computer are combined for redundancy or increased throughput....

. Most of these solutions required manual configuration and identical equipment on both sides of the aggregation.

The second problem involves the three single points of failure
Single point of failure
A single point of failure is a part of a system that, if it fails, will stop the entire system from working. They are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system.-Overview:Systems can be made...

 in a typical port-cable-port connection. In either the usual computer-to-switch or in a switch-to-switch configuration, the cable itself or either of the ports the cable is plugged into can fail. Multiple physical connections can be made, but many of the higher level protocols
OSI model
The Open Systems Interconnection model is a product of the Open Systems Interconnection effort at the International Organization for Standardization. It is a prescription of characterizing and standardizing the functions of a communications system in terms of abstraction layers. Similar...

 were not designed to failover
Failover
In computing, failover is automatic switching to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application, server, system, or network...

 completely seamlessly.

Standardization process

By the mid 1990s, most network switch manufacturers had included aggregation capability as a proprietary extension to increase bandwidth between their switches. But each manufacturer developed its own method, which led to compatibility problems. The IEEE 802.3 group took up a study group to create an inter-operable link layer
Link Layer
In computer networking, the link layer is the lowest layer in the Internet Protocol Suite , the networking architecture of the Internet . It is the group of methods or protocols that only operate on a host's link...

 standard in a November 1997 meeting. The group quickly agreed to include an automatic configuration feature which would add in redundancy as well. This became known as "Link Aggregation Control Protocol".

Initial release 802.3ad in 2000

most gigabit channel-bonding schemes use the IEEE standard of Link Aggregation which was formerly clause 43 of the IEEE 802.3
IEEE 802.3
IEEE 802.3 is a working group and a collection of IEEE standards produced by the working group defining the physical layer and data link layer's media access control of wired Ethernet. This is generally a local area network technology with some wide area network applications...

 standard added in March 2000 by the IEEE 802.3ad task force. Nearly every network equipment manufacturer quickly adopted this joint standard over their proprietary standards.

Move to 802.1 layer in 2008

David Law noted in 2006 that certain 802.1 layers (such as 802.1X security) were positioned in the protocol stack
Protocol stack
The protocol stack is an implementation of a computer networking protocol suite. The terms are often used interchangeably. Strictly speaking, the suite is the definition of the protocols, and the stack is the software implementation of them....

 above Link Aggregation which was defined as an 802.3 sublayer.
This discrepancy was resolved with formal transfer of the protocol to the 802.1 group with the publication of IEEE 802.1AX-2008 on 3 November 2008.

Link Aggregation Control Protocol

Within the IEEE specification the Link Aggregation Control Protocol (LACP) provides a method to control the bundling of several physical ports
Computer port (hardware)
In computer hardware, a port serves as an interface between the computer and other computers or peripheral devices. Physically, a port is a specialized outlet on a piece of equipment to which a plug or cable connects...

 together to form a single logical channel. LACP allows a network device to negotiate an automatic bundling of links by sending LACP packets to the peer (directly connected device that also implements LACP).

Advantages over static configuration

  • Failover
    Failover
    In computing, failover is automatic switching to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application, server, system, or network...

     when a link fails and there is (for example) a Media Converter
    Fiber media converter
    Fiber media converters are simple networking devices that make it possible to connect two dissimilar media types such as twisted pair with fiber optic cabling. They were introduced to the industry nearly two decades ago, and are important in interconnecting fiber optic cabling-based systems with...

     between the devices which means that the peer will not see the link down. With static link aggregation the peer would continue sending traffic down the link causing it to be lost.
  • The device can confirm that the configuration at the other end can handle link aggregation. With Static link aggregation a cabling or configuration mistake could go undetected and cause undesirable network behavior.

Practical notes

LACP works by sending frames (LACPDUs) down all links that have the protocol enabled. If it finds a device on the other end of the link that also has LACP enabled, it will also independently send frames along the same links enabling the two units to detect multiple links between themselves and then combine them into a single logical link. LACP can be configured in one of two modes: active or passive. In active mode it will always send frames along the configured links. In passive mode however, it acts as "speak when spoken to", and therefore can be used as a way of controlling accidental loops (as long as the other device is in active mode).

Proprietary link aggregation

In addition to the IEEE link aggregation substandards, there are a number of proprietary aggregation schemes including Cisco's EtherChannel
EtherChannel
EtherChannel is a port link aggregation technology or port-channel architecture used primarily on Cisco switches. It allows grouping of several physical Ethernet links to create one logical Ethernet link for the purpose of providing fault-tolerance and high-speed links between switches, routers and...

, Dynamic Trunking Protocol
Dynamic Trunking Protocol
The Dynamic Trunking Protocol is a proprietary networking protocol developed by Cisco Systems for the purpose of negotiating trunking on a link between two VLAN-aware switches, and for negotiating the type of trunking encapsulation to be used. It works on the Layer 2 of the OSI model...

 and Port Aggregation Protocol
Port Aggregation Protocol
Port Aggregation Protocol is a Cisco Systems proprietary networking protocol, which is used for the automated, logical aggregation of Ethernet switch ports, known as an etherchannel. This means it can only be used between Cisco switches and/or switches from licensed vendors...

, Nortel's Multi-link trunking
Multi-link trunking
Multi-Link Trunking is a link aggregation or IEEE 802.3ad port trunking technology designed by Nortel . It allows grouping several physical Ethernet links into one logical Ethernet link to provide fault-tolerance and high-speed links between routers, switches, and servers...

, Split Multi-Link Trunking, Routed Split Multi-Link Trunking
R-SMLT
Routed-SMLT is a computer networking protocol designed by Nortel as an enhancement to SMLT enabling the exchange of Layer 3 information between peer nodes in a Switch Cluster for unparalleled resiliency and simplicity for both L3 and L2.In many cases, core network convergence-times after a...

 and Distributed Split Multi-Link Trunking
Distributed Split Multi-Link Trunking
Distributed Split Multi-Link Trunking or Distributed SMLT is a computer networking technology designed by Avaya to enhance the Split Multi-Link Trunking protocol...

, ZTE's "Smartgroup", or Huawei's "EtherTrunk". Most high-end network devices support some kind of link aggregation, and software-based implementations – such as the *BSD
FreeBSD
FreeBSD is a free Unix-like operating system descended from AT&T UNIX via BSD UNIX. Although for legal reasons FreeBSD cannot be called “UNIX”, as the direct descendant of BSD UNIX , FreeBSD’s internals and system APIs are UNIX-compliant...

 lagg package, Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

' bonding driver, Solaris' dladm etc. – also exist for many operating systems.

Aggregation Modes in Linux (Bonding Modes)

The Linux bonding driver provides a method for aggregating multiple network interfaces into a single logical bonded interface. The behavior of the bonded interfaces depends upon the mode; generally speaking, modes provide either hot standby or load balancing services.
Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.
Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. This mode provides fault tolerance. The primary option affects the behavior of this mode.
XOR policy: Transmit based on [(source MAC address XOR'd with destination MAC address) modulo slave count]. This selects the same slave for each destination MAC address. This mode provides load balancing and fault tolerance.
Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.
IEEE 802.3ad Dynamic link aggregation: Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.
Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPV4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.

Network backbone

Link aggregation offers an inexpensive way to set up a high-speed backbone network
Backbone network
A backbone network or network backbone is a part of computer network infrastructure that interconnects various pieces of network, providing a path for the exchange of information between different LANs or subnetworks. A backbone can tie together diverse networks in the same building, in different...

 that transfers much more data than any one single port or device can deliver. Although, in the past, various vendors used proprietary techniques, the preference today is to use the IEEE standard, which can either be set up statically or by using the Link Aggregation Control Protocol (LACP). This allows several devices to communicate simultaneously at their full single-port speed while not allowing any one single device to monopolize all available backbone capacity.

The actual benefits vary based on the load-balancing method used on each device (administrators can configure different balancing algorithms at each end and this is actually encouraged to avoid path polarization). Link aggregation also allows the network's backbone speed to grow incrementally as demand on the network increases, without having to replace everything and buy new hardware.

Most backbone installations install more cabling or fiber optic pairs than is initially necessary, even if they have no immediate need for the additional cabling. This is done because labor costs are higher than the cost of the cable, and running extra cable reduces future labor costs if networking needs change. Link aggregation can allow the use of these extra cables to increase backbone speeds for little or no extra cost if ports are available.

Order of frames

When balancing traffic, network administrators often wish to avoid reordering Ethernet frames. For example, TCP suffers additional overhead when dealing with out-of-order packets. This goal is approximated by sending all frames associated with a particular session across the same link. The most common implementations use L3 hashes (i.e. based on the IP address), ensuring that the same flow is always sent via the same physical link.

However, depending on the traffic, this may not provide even distribution across the links in the trunk. It effectively limits the client bandwidth in an aggregate to its single member's maximum bandwidth per session. Principally for this reason 50/50 load balancing is almost never reached in real-life implementations; around 70/30 is more usual. Advanced switches can employ an L4 hash (i.e. using TCP/UDP port numbers), which will bring the balance closer to 50/50 as different L4 flows between two hosts can make use of different physical links.

Efficiency of equipment

Aggregation becomes inefficient beyond a certain bandwidth
Bandwidth (computing)
In computer networking and computer science, bandwidth, network bandwidth, data bandwidth, or digital bandwidth is a measure of available or consumed data communication resources expressed in bits/second or multiples of it .Note that in textbooks on wireless communications, modem data transmission,...

 — depending on the total number of ports
Computer port (hardware)
In computer hardware, a port serves as an interface between the computer and other computers or peripheral devices. Physically, a port is a specialized outlet on a piece of equipment to which a plug or cable connects...

 on the switch equipment. A 24-port gigabit switch with two 8-gigabit trunks is using sixteen of its available ports just for the two interswitch connections, and leaves only eight of its 1-gigabit ports for other devices. This same configuration on a 48-port gigabit switch leaves 32 1-gigabit ports available, and so it is much more efficient (assuming of course that those ports are actually needed at the switch location).

When a switch utilizes 40-50% of its ports for backbone trunking, upgrading to a switch with either more ports or a higher base-operating speed may be a better option than simply adding more switches, especially if the old switch can be re-used elsewhere on a less performance-critical part of the network.

Use on network interface cards

Network interface cards (NICs) trunked together can also provide network links beyond the throughput of any one single NIC. For example, this allows a central file server to establish an aggregate 2-gigabit connection using two 1-gigabit NICs trunked together. Note the data signaling rate will still be 1Gbit/s, which can be misleading depending on methodologies used to test throughput after link aggregation is employed.

Note that Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

 does not natively support link aggregation (at least up to Windows Server 2008). However, some manufacturers provide software for aggregation on their multiport NICs at the device-driver
Device driver
In computing, a device driver or software driver is a computer program allowing higher-level computer programs to interact with a hardware device....

 layer. Intel, for example, has released a package for Windows called Advanced Networking Services (ANS) to bind Intel Fast Ethernet and Gigabit cards. Nvidia also supports "teaming" with their Nvidia Network Access Manager/Firewall Tool. HP also has a very robust teaming tool for HP branded NICs which will allow for non-etherchanneled NIC teaming or which will also support several modes of etherchannel (port aggregation) including 802.3ad with LACP.

Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

, FreeBSD
FreeBSD
FreeBSD is a free Unix-like operating system descended from AT&T UNIX via BSD UNIX. Although for legal reasons FreeBSD cannot be called “UNIX”, as the direct descendant of BSD UNIX , FreeBSD’s internals and system APIs are UNIX-compliant...

, NetBSD
NetBSD
NetBSD is a freely available open source version of the Berkeley Software Distribution Unix operating system. It was the second open source BSD descendant to be formally released, after 386BSD, and continues to be actively developed. The NetBSD project is primarily focused on high quality design,...

, OpenBSD
OpenBSD
OpenBSD is a Unix-like computer operating system descended from Berkeley Software Distribution , a Unix derivative developed at the University of California, Berkeley. It was forked from NetBSD by project leader Theo de Raadt in late 1995...

, Mac OS X
Mac OS X
Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

, OpenSolaris
OpenSolaris
OpenSolaris was an open source computer operating system based on Solaris created by Sun Microsystems. It was also the name of the project initiated by Sun to build a developer and user community around the software...

, Citrix XenServer
Xen
Xen is a virtual-machine monitor providing services that allow multiple computer operating systems to execute on the same computer hardware concurrently....

, VMware ESX
VMware ESX
VMware ESX is an enterprise-level computer virtualization product offered by VMware, Inc. ESX is a component of VMware's larger offering, VMware Infrastructure, and adds management and reliability services to the core server product...

, and commercial Unix distributions such as AIX implement Ethernet bonding (trunking) at a higher level, and can hence deal with NICs from different manufacturers or drivers, as long as the NIC is supported by the kernel.

Single switch

With modes balance-rr, balance-xor, broadcast and 802.3ad all physical ports in the link aggregation group must reside on the same logical switch, which in most scenarios will leave a single point of failure when the physical switch to which both links are connected goes offline. Modes active-backup, balance-tlb, and balance-alb can also be set up with two or more switches. But after failover (like all other modes), in some cases, active sessions may fail (due to arp problems) and have to be restarted.

However, almost all vendors have proprietary extensions that resolve some of this issue: they aggregate multiple physical switches into one logical switch. , the IEEE has not yet committed resources to standardize this feature. The SMLT protocol allows multiple Ethernet links to be split across two devices, preventing any single point of failure, and additionally allowing the load to be balanced across the 2 aggregation switches from the single access system. These devices synchronize state across an Inter-Switch Trunk (IST) such that they appear to the connecting (access) device to be a single device (switch block) and prevent any packet duplication. SMLT's provide enhanced resiliency with sub-second failover and sub-second recovery for all speed trunks (10 Mbit/s, 100 Mbit/s, 1,000 Mbit/s, and 10 Gbit/s) while operating transparently to end-devices.

Same link speed

In most implementations, all the ports used in an aggregation consist of the same physical type, such as all copper ports (10/100/1000BASE‑T), all multi-mode fiber ports, or all single-mode fiber ports. However, all the IEEE standard requires is that each link be full duplex and all of them have an identical speed (10, 100, 1,000 or 10,000 Mbit/s).

Many switches are PHY independent, meaning that a switch could have a mixture of copper, SX, LX, LX10 or other GBICs. While maintaining the same PHY is the usual approach, it is possible to aggregate a 1000BASE-SX fiber for one link and a 1000BASE-LX (longer, diverse path) for the second link, but the important thing is that the speed will be 1 Gbit/s full duplex for both links. One path may have a slightly longer transit time but the standard has been engineered so this will not cause an issue.

Ethernet aggregation mismatch

Aggregation mismatch refers to not matching the aggregation type on both ends of the link. Some switches do not implement the 802.1AX standard but support static configuration of link aggregation. Therefore link aggregation between similarly statically configured switches will work, but will fail between a statically configured switch and a device that is configured for LACP.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK