Virtual Interface Architecture
Encyclopedia
The Virtual Interface Architecture (VIA) is an abstract model of a user-level zero-copy
network
, and is the basis for InfiniBand
, iWARP
and RoCE
. Created by Microsoft
, Intel, and Compaq
, the original VIA sought to standardize the interface for high-performance network technologies known as System Area Networks (SANs; not to be confused with Storage Area Network
s).
Networks are a shared resource. With traditional network APIs such as the Berkeley socket API
, the kernel is involved in every network communication. This presents a tremendous performance bottleneck when latency
is an issue.
One of the classic developments in computing systems is virtual memory
, a combination of hardware and software that creates the illusion of private memory for each process. In the same school of thought, a virtual network interface protected across process boundaries could be accessed at the user level. With this technology, the "consumer" manages its own buffers and communication schedule while the "provider" handles the protection.
Thus, the network interface card (NIC) provides a "private network" for a process, and a process is usually allowed to have multiple such networks. The virtual interface (VI) of VIA refers to this network and is merely the destination of the user's communication requests. Communication takes place over a pair of VIs, one on each of the processing nodes involved in the transmission. In "kernel-bypass" communication, the user manages its own buffers.
Another facet of traditional networks is that arriving data is placed in a pre-allocated buffer and then copied to the user-specified final destination. Copying large messages can take a long time, and so eliminating this step is beneficial. Another classic development in computing systems is direct memory access
(DMA), in which a device can access main memory directly while the CPU is free to perform other tasks.
In a network with "remote direct memory access" (RDMA), the sending NIC uses DMA to read data in the user-specified buffer and transmit it as a self-contained message across the network. The receiving NIC then uses DMA to place the data into the user-specified buffer. There is no intermediary copying and all of these actions occur without involvement of the CPUs, which has an added benefit of lower CPU utilization.
For the NIC to actually access the data through DMA, the user's page must be in memory. In VIA, the user must "pin-down" its buffers before transmission, so as to prevent the OS from swapping the page out to the disk. This action—one of the few that involves the kernel—ties the page to physical memory. To ensure that only the process that owns the registered memory may access it, the VIA NICs require permission keys known as "protection tags" during communication.
So essentially VIA is a standard that defines kernel bypassing and RDMA in a network. It also defines a programming library called "VIPL". It has been implemented, most notably in cLAN from Giganet (now Emulex). Mostly though, VIA's major contribution has been in providing a basis for the InfiniBand
, iWARP
and RoCE
standards.
Zero-copy
"Zero-copy" describes computer operations in which the CPU does not perform the task of copying data from one memory area to another. This is most often used to save on processing power and memory use when sending files over a network.- Principle :...
network
Computer network
A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information....
, and is the basis for InfiniBand
InfiniBand
InfiniBand is a switched fabric communications link used in high-performance computing and enterprise data centers. Its features include high throughput, low latency, quality of service and failover, and it is designed to be scalable...
, iWARP
IWARP
The Internet Wide Area RDMA Protocol is a computer networking protocol for transferring data efficiently.It is sometimes referred to simply as "RDMA", though RDMA is not a feature exclusive to iWARP.-History:...
and RoCE
RDMA over Converged Ethernet
RDMA over Converged Ethernet is a network protocol that allows remote direct memory access over an Ethernet network. RoCE is a link layer protocol and hence allows communication between any two hosts in the same Ethernet broadcast domain...
. Created by Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...
, Intel, and Compaq
Compaq
Compaq Computer Corporation is a personal computer company founded in 1982. Once the largest supplier of personal computing systems in the world, Compaq existed as an independent corporation until 2002, when it was acquired for US$25 billion by Hewlett-Packard....
, the original VIA sought to standardize the interface for high-performance network technologies known as System Area Networks (SANs; not to be confused with Storage Area Network
Storage area network
A storage area network is a dedicated network that provides access to consolidated, block level data storage. SANs are primarily used to make storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers so that the devices appear like locally attached devices...
s).
Networks are a shared resource. With traditional network APIs such as the Berkeley socket API
Berkeley sockets
The Berkeley sockets application programming interface comprises a library for developing applications in the C programming language that perform inter-process communication, most commonly for communications across a computer network....
, the kernel is involved in every network communication. This presents a tremendous performance bottleneck when latency
Lag
Lag is a common word meaning to fail to keep up or to fall behind. In real-time applications, the term is used when the application fails to respond in a timely fashion to inputs...
is an issue.
One of the classic developments in computing systems is virtual memory
Virtual memory
In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage , allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which...
, a combination of hardware and software that creates the illusion of private memory for each process. In the same school of thought, a virtual network interface protected across process boundaries could be accessed at the user level. With this technology, the "consumer" manages its own buffers and communication schedule while the "provider" handles the protection.
Thus, the network interface card (NIC) provides a "private network" for a process, and a process is usually allowed to have multiple such networks. The virtual interface (VI) of VIA refers to this network and is merely the destination of the user's communication requests. Communication takes place over a pair of VIs, one on each of the processing nodes involved in the transmission. In "kernel-bypass" communication, the user manages its own buffers.
Another facet of traditional networks is that arriving data is placed in a pre-allocated buffer and then copied to the user-specified final destination. Copying large messages can take a long time, and so eliminating this step is beneficial. Another classic development in computing systems is direct memory access
Direct memory access
Direct memory access is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory independently of the central processing unit ....
(DMA), in which a device can access main memory directly while the CPU is free to perform other tasks.
In a network with "remote direct memory access" (RDMA), the sending NIC uses DMA to read data in the user-specified buffer and transmit it as a self-contained message across the network. The receiving NIC then uses DMA to place the data into the user-specified buffer. There is no intermediary copying and all of these actions occur without involvement of the CPUs, which has an added benefit of lower CPU utilization.
For the NIC to actually access the data through DMA, the user's page must be in memory. In VIA, the user must "pin-down" its buffers before transmission, so as to prevent the OS from swapping the page out to the disk. This action—one of the few that involves the kernel—ties the page to physical memory. To ensure that only the process that owns the registered memory may access it, the VIA NICs require permission keys known as "protection tags" during communication.
So essentially VIA is a standard that defines kernel bypassing and RDMA in a network. It also defines a programming library called "VIPL". It has been implemented, most notably in cLAN from Giganet (now Emulex). Mostly though, VIA's major contribution has been in providing a basis for the InfiniBand
InfiniBand
InfiniBand is a switched fabric communications link used in high-performance computing and enterprise data centers. Its features include high throughput, low latency, quality of service and failover, and it is designed to be scalable...
, iWARP
IWARP
The Internet Wide Area RDMA Protocol is a computer networking protocol for transferring data efficiently.It is sometimes referred to simply as "RDMA", though RDMA is not a feature exclusive to iWARP.-History:...
and RoCE
Rocé
Rocé is a commune in the Loir-et-Cher department of central France.-See also:*Communes of the Loir-et-Cher department...
standards.
External links
- Usenix notes on VIA
- The Virtual Interface Architecture, a book from Intel