VEX prefix
The VEX prefix and VEX coding scheme are a proposed future extension to the x86 instruction set architecture for microprocessors from Intel, AMD and others.
Features
The proposed VEX coding scheme extends the existing x86 instruction set architecture to allow the definition of new instructions and the extension or modification of previously existing instruction codes. This serves the following purposes:
- The opcode map is extended to make space for future instructions.
- It allows instruction codes to have up to five operands, where the original scheme allows only two operands (in rare cases three operands).
- It allows the size of SIMD vector registers to be extended from the 128-bit XMM registers to 256-bit registers named YMM. There is room for further extensions of the register size in the future.
- It allows existing two-operand instructions to be modified into non-destructive three-operand forms where the destination register is different from both source registers. For example c = a + b instead of a = a + b (where register a is changed by the instruction).
Technical description
The proposed VEX coding scheme uses a code prefix consisting of 2 or 3 bytes, which is added to existing or new instruction codes.
The VEX prefix replaces the most commonly used instruction prefix bytes and escape codes. In many cases, the number of prefix bytes and escape bytes that are replaced is the same as the number of bytes in the VEX prefix, so that the total length of the VEX-encoded instruction is the same as the length of the legacy instruction code. In other cases, the VEX-encoded version is longer or shorter than the legacy code.
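The three length outcomes can be illustrated with a short sketch. The byte sequences below are hand-assembled for this illustration (register-to-register forms of `addps`, `pshufb` and `addpd`) and are an assumption rather than part of the description above; they should be checked against an assembler before being relied on. The names `PAIRS` and `compare_lengths` are hypothetical.

```python
# Legacy encoding vs. VEX encoding of the same operation.
# All byte strings are hand-assembled assumptions for illustration.
PAIRS = {
    # legacy: 0F 58 /r            VEX: C5 F8 58 /r
    "addps xmm0, xmm1": (bytes.fromhex("0F58C1"), bytes.fromhex("C5F858C1")),
    # legacy: 66 0F 38 00 /r      VEX: C4 E2 79 00 /r
    "pshufb xmm0, xmm1": (bytes.fromhex("660F3800C1"), bytes.fromhex("C4E27900C1")),
    # legacy: 66 REX.R 0F 58 /r   VEX: C5 39 58 /r
    "addpd xmm8, xmm1": (bytes.fromhex("66440F58C1"), bytes.fromhex("C53958C1")),
}


def compare_lengths(pairs=PAIRS):
    """Return {instruction: len(VEX form) - len(legacy form)} in bytes."""
    return {name: len(vex) - len(legacy) for name, (legacy, vex) in pairs.items()}


if __name__ == "__main__":
    for name, delta in compare_lengths().items():
        print(f"{name}: VEX form differs by {delta:+d} byte(s)")
```

In this sketch the first pair grows by one byte because the legacy form had only a single escape byte to replace, the second pair keeps its length because one prefix and two escape bytes are replaced by a three-byte VEX prefix, and the third shrinks because a prefix, a REX byte and an escape byte are replaced by a two-byte VEX prefix.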
The VEX prefix contains the following components:
- The bit, R, contained in the REX prefix used in the x86-64 instruction set extension.
- Four bits named v, specifying a second source register operand.
- A bit named L specifying 256-bit vector length.
- Two bits named p to replace operand size prefixes and operand type prefixes (66, F2, F3).
The three-byte VEX prefix additionally contains:
- The three bits, X, B and W, also contained in the REX prefix used in the x86-64 instruction set extension.
- Five bits named m. Two of the m bits are used for replacing existing escape codes and for specifying the length of the instruction. The remaining three m bits are reserved for future use, such as specifying vector lengths >256 bits, specifying different instruction lengths, or extending the opcode space.
The encoding is as follows:
Prefix | First byte (bits 7 to 0) | Second byte (bits 7 to 0) | Third byte (bits 7 to 0)
---|---|---|---
3-byte VEX | 1 1 0 0 0 1 0 0 | R̅ X̅ B̅ m4 m3 m2 m1 m0 | W v̅3 v̅2 v̅1 v̅0 L p1 p0
2-byte VEX | 1 1 0 0 0 1 0 1 | R̅ v̅3 v̅2 v̅1 v̅0 L p1 p0 | (none)
The VEX opcode bytes are the same as those used by the LDS (C5) and LES (C4) instructions. These instructions are not supported in 64-bit mode, while in 32-bit mode a following ModRM byte cannot be of the form 11xxxxxx (which would specify a register operand). Various bits are inverted to ensure that the second byte of a VEX prefix is always of this form in 32-bit mode.
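The disambiguation rule in the preceding paragraph can be sketched as a two-byte check. The function name `is_vex_prefix` and its `long_mode` parameter are hypothetical, and the sketch deliberately ignores real mode and virtual-8086 mode, which is an assumption made to keep it short.

```python
def is_vex_prefix(first, second, long_mode):
    """Decide whether two bytes start a VEX prefix rather than LES/LDS.

    first, second: the first two bytes at the opcode position (0-255).
    long_mode: True when decoding 64-bit code, False for 32-bit code.
    """
    if first not in (0xC4, 0xC5):
        return False
    if long_mode:
        # LES and LDS are not available in 64-bit mode, so C4/C5 can
        # only be a VEX prefix there.
        return True
    # In 32-bit mode LES/LDS need a memory operand, so their ModRM byte
    # can never have the top two bits set to 11. The inverted bits of a
    # VEX prefix guarantee that its second byte does have that form.
    return (second & 0xC0) == 0xC0
```

For example, the byte pair C5 F8 is treated as a VEX prefix in 32-bit code, whereas C4 08 is treated as an LES instruction with a memory operand.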
- The R̅, X̅ and B̅ bits are the inversions of the REX prefix's R, X and B bits; these provide a fourth (high) bit for register index fields (the ModRM reg field; the SIB index field; and the ModRM r/m, SIB base, or opcode reg field, respectively), allowing access to 16 instead of 8 registers. The W bit is equivalent to the REX prefix's W bit, and specifies a 64-bit operand; for non-integer instructions, it is a general opcode extension bit.
- v̅ is the inversion of an additional source register index.
- m replaces leading opcode prefix bytes. The values 1, 2 and 3 are equivalent to opcode prefixes 0F, 0F 38 and 0F 3A; all other values are currently reserved. The 2-byte VEX prefix always corresponds to the 0F prefix.
- L indicates the vector length; 0 for 128-bit SSE (XMM) registers, and 1 for 256-bit AVX (YMM) registers.
- p encodes additional prefix bytes. The values 0, 1, 2, and 3 correspond to implied prefixes of none, 66, F3, and F2. These encode the operand type for SSE instructions: packed single, packed double, scalar single and scalar double, respectively.
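The field descriptions above can be sketched as a small encoder and decoder. The names `encode_vex` and `decode_vex` and their keyword arguments are hypothetical, chosen for this illustration. The sketch assumes the caller passes non-inverted field values and applies the inversions itself, and it assumes the 2-byte form is chosen whenever X, B and W are zero and m is 1 (the 0F escape).

```python
def encode_vex(r=0, x=0, b=0, w=0, m=1, vvvv=0, l=0, p=0):
    """Build a 2- or 3-byte VEX prefix from non-inverted field values.

    r, x, b, w: single bits with the same meaning as in a REX prefix.
    m: escape selector (1 = 0F, 2 = 0F 38, 3 = 0F 3A).
    vvvv: index (0-15) of the additional source register.
    l: 0 for 128-bit, 1 for 256-bit vectors.
    p: implied prefix (0 = none, 1 = 66, 2 = F3, 3 = F2).
    """
    v_inverted = (~vvvv) & 0xF
    low7 = (v_inverted << 3) | (l << 2) | p      # v3 v2 v1 v0 L p1 p0
    r_inv, x_inv, b_inv = r ^ 1, x ^ 1, b ^ 1
    if x == 0 and b == 0 and w == 0 and m == 1:
        # The 2-byte form carries only R, v, L and p.
        return bytes([0xC5, (r_inv << 7) | low7])
    return bytes([0xC4,
                  (r_inv << 7) | (x_inv << 6) | (b_inv << 5) | m,
                  (w << 7) | low7])


def decode_vex(prefix):
    """Split a VEX prefix back into non-inverted field values."""
    if prefix[0] == 0xC5:
        r = ((prefix[1] >> 7) & 1) ^ 1
        x = b = w = 0
        m = 1                                    # 2-byte form implies 0F
        low7 = prefix[1] & 0x7F
    elif prefix[0] == 0xC4:
        r = ((prefix[1] >> 7) & 1) ^ 1
        x = ((prefix[1] >> 6) & 1) ^ 1
        b = ((prefix[1] >> 5) & 1) ^ 1
        m = prefix[1] & 0x1F
        w = (prefix[2] >> 7) & 1
        low7 = prefix[2] & 0x7F
    else:
        raise ValueError("not a VEX prefix")
    return {"r": r, "x": x, "b": b, "w": w, "m": m,
            "vvvv": (~(low7 >> 3)) & 0xF, "l": (low7 >> 2) & 1, "p": low7 & 3}
```

With all fields left at their defaults, `encode_vex()` yields the bytes C5 F8; selecting the 0F 38 escape with an implied 66 prefix, `encode_vex(m=2, p=1)`, yields C4 E2 79.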
Instructions that need more than three operands have an extra suffix byte specifying one or two additional register operands. Instructions coded with the VEX prefix can have up to five operands. At most one of the operands can be a memory operand, and at most one of the operands can be an immediate constant of 4 or 8 bits. The remaining operands are registers.
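The operand limits stated above can be written down as a simple check. This is a minimal sketch: the function name `operands_allowed` and the string labels `"reg"`, `"mem"`, `"imm4"` and `"imm8"` are hypothetical, and the check covers only the limits named in the paragraph, not the rules of any particular instruction.

```python
def operands_allowed(operands):
    """Check an operand list against the general VEX operand limits.

    operands: list of labels, each one of "reg", "mem", "imm4", "imm8".
    """
    allowed_kinds = {"reg", "mem", "imm4", "imm8"}
    if len(operands) > 5:
        return False                      # at most five operands
    if any(op not in allowed_kinds for op in operands):
        return False                      # unknown operand label
    immediates = sum(1 for op in operands if op.startswith("imm"))
    # At most one memory operand and at most one immediate constant.
    return operands.count("mem") <= 1 and immediates <= 1
```

A list such as `["reg", "reg", "mem", "imm8"]` passes the check, while one containing two memory operands or six operands does not.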
The AVX instruction set is the first instruction set extension to use the VEX coding scheme. The AVX instructions have up to four operands. The AVX instruction set allows the VEX prefix to be applied only to instructions using the SIMD XMM registers. However, the VEX coding scheme has space for applying the VEX prefix to other instructions as well in future instruction sets.
Legacy SIMD instructions with a VEX prefix added are equivalent to the same instructions without VEX prefix with the following differences:
- The VEX-encoded instruction can have one more operand, making it non-destructive.
- A 128-bit XMM instruction without VEX prefix leaves the upper half of the full 256-bit YMM register unchanged, while the VEX-encoded version sets the upper half to zero.
Instructions that use the whole 256-bit YMM register should not be mixed with non-VEX instructions that leave the upper half of the register unchanged, for reasons of efficiency.
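The difference in how the upper half of the register is treated can be modelled with plain integers. This is a sketch, not a description of hardware: a 256-bit YMM register is represented as a Python integer, and the function names `write_xmm_legacy` and `write_xmm_vex` are hypothetical.

```python
MASK128 = (1 << 128) - 1          # low half (the XMM part)
UPPER128 = MASK128 << 128         # high half of the 256-bit YMM register


def write_xmm_legacy(ymm, result128):
    """Model a non-VEX 128-bit write: the upper half is left unchanged."""
    return (ymm & UPPER128) | (result128 & MASK128)


def write_xmm_vex(ymm, result128):
    """Model a VEX-encoded 128-bit write: the upper half is set to zero."""
    return result128 & MASK128


if __name__ == "__main__":
    ymm = (0xAAAA << 128) | 0x1111
    print(hex(write_xmm_legacy(ymm, 0x2222)))   # upper half still 0xAAAA
    print(hex(write_xmm_vex(ymm, 0x2222)))      # upper half cleared
```

The model shows why the two encodings are not interchangeable once 256-bit code is in use: the legacy form makes the new register value depend on the old upper half, while the VEX form does not.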
History
- In August 2007, AMD proposed the SSE5 instruction set extension, which includes a new coding scheme for instructions with three operands, using an extra byte named DREX, intended for the Bulldozer processor core, due to begin production in 2011.
- In March 2008, Intel proposed the AVX instruction set, using the new VEX coding scheme.
- In August 2008, commentators deplored the expected incompatibility between AMD and Intel instruction sets, and proposed that AMD revise their plans and replace the DREX scheme with the more flexible and extensible VEX scheme.
- In May 2009, AMD announced a revision of the proposed SSE5 instruction set to make it compatible with the AVX instruction set and the VEX coding scheme. The revised SSE5 is called XOP.
- January 2011. The AVX instruction set is supported in Intel's Sandy Bridge microprocessor architecture.
- 2011. The AVX, XOP and FMA4 instruction sets, all using the VEX scheme, will be supported in the AMD Bulldozer processor, according to AMD plans.
- Unknown date. The FMA3 instruction set, but possibly not FMA4, will be supported in Intel processors.