AltiVec
AltiVec is a floating point and integer SIMD instruction set designed and owned by Apple, IBM and Freescale Semiconductor, formerly the Semiconductor Products Sector of Motorola (the AIM alliance), and implemented on versions of the PowerPC including Motorola's G4, IBM's G5 and POWER6 processors, and P.A. Semi's PWRficient PA6T. AltiVec is a trademark owned solely by Freescale, so the system is also referred to as Velocity Engine by Apple and VMX by IBM and P.A. Semi, although IBM has recently begun using AltiVec as well.
While AltiVec refers to an instruction set, the implementations in CPUs produced by IBM and Motorola are separate in terms of logic design. To date, no IBM core has included an AltiVec logic design licensed from Motorola or vice versa.
AltiVec is a standard part of the new Power ISA v.2.03 specification. It was never formally a part of the PowerPC architecture until this specification, although it used PowerPC instruction formats and syntax and occupied the opcode space expressly allocated for such purposes.
Features and comparison to x86-64 Streaming SIMD Extensions
Both AltiVec and SSE feature 128-bit vector registers that can represent sixteen 8-bit signed or unsigned chars, eight 16-bit signed or unsigned shorts, four 32-bit ints or four 32-bit floating point variables. Both provide cache-control instructions intended to minimize cache pollution when working on streams of data.
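The lane layouts described above can be pictured with a small portable sketch. This is not AltiVec code: `vec128`, `lanes_for` and `add_u8` are hypothetical names chosen for illustration, and the union only models how the same 16 bytes can be viewed as sixteen, eight or four elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable stand-in for one 128-bit vector register.
   Each member is a different lane view of the same 16 bytes. */
typedef union {
    uint8_t  u8[16];  /* sixteen 8-bit chars */
    uint16_t u16[8];  /* eight 16-bit shorts */
    uint32_t u32[4];  /* four 32-bit ints    */
    float    f32[4];  /* four 32-bit floats  */
} vec128;

/* Number of lanes that fit in the register for a given element size. */
static size_t lanes_for(size_t elem_bytes) {
    return sizeof(vec128) / elem_bytes;
}

/* Lane-wise modulo add of sixteen unsigned chars: the kind of work a
   single SIMD instruction performs on a whole register at once. */
static vec128 add_u8(vec128 a, vec128 b) {
    vec128 r;
    for (size_t i = 0; i < 16; ++i)
        r.u8[i] = (uint8_t)(a.u8[i] + b.u8[i]);
    return r;
}
```

Real hardware performs the sixteen additions in one instruction; the loop here only spells out the per-lane semantics.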
They also exhibit important differences. Unlike SSE2, AltiVec supports a special RGB "pixel" data type, but it does not operate on 64-bit double precision floats, and there is no way to move data directly between scalar and vector registers. In keeping with the "load/store" model of the PowerPC's RISC design, the vector registers, like the scalar registers, can only be loaded from and stored to memory. However, AltiVec provides a more complete set of "horizontal" operations that work across all the elements of a vector, and it allows a wider range of combinations of data type and operation. Thirty-two 128-bit vector registers are provided, compared to eight for SSE and SSE2 (extended to 16 in x86-64), and most AltiVec instructions take three register operands, compared to only two register/register or register/memory operands on IA-32.
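As an illustration of what a "horizontal" operation means, the following portable sketch sums across the lanes of one vector. It is loosely modeled on the saturating sum-across operation that AltiVec exposes as `vec_sums`, as far as that behavior is assumed here; the type `vec_s32` and the functions `saturate_s32` and `sum_across` are hypothetical names, not part of any real API.

```c
#include <stdint.h>

/* Hypothetical stand-in for a vector of four signed 32-bit lanes. */
typedef struct { int32_t w[4]; } vec_s32;

/* Clamp a wide intermediate value to the int32_t range (saturation). */
static int32_t saturate_s32(int64_t x) {
    if (x > INT32_MAX) return INT32_MAX;
    if (x < INT32_MIN) return INT32_MIN;
    return (int32_t)x;
}

/* Horizontal sum: add all four lanes of a plus the last lane of b,
   saturate, and place the result in the last lane; other lanes are 0. */
static vec_s32 sum_across(vec_s32 a, vec_s32 b) {
    int64_t total = (int64_t)b.w[3];
    for (int i = 0; i < 4; ++i)
        total += a.w[i];
    vec_s32 r = { {0, 0, 0, 0} };
    r.w[3] = saturate_s32(total);
    return r;
}
```

A "vertical" operation such as a lane-wise add combines lane i of one vector with lane i of another; the function above instead combines lanes of the same vector, which is the distinction the paragraph draws.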
AltiVec is also unique in its support for a flexible vector permute instruction, in which each byte of a resulting vector value can be taken from any byte of either of two other vectors, parametrized by yet another vector. This allows for sophisticated manipulations in a single instruction.
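The permute semantics described above can be sketched in portable C. This is an emulation for illustration, not AltiVec code: `vec_u8` and `permute` are hypothetical names, and the sketch assumes that only the low five bits of each control byte are used to select one of the 32 source bytes.

```c
#include <stdint.h>

/* Hypothetical stand-in for a vector of sixteen bytes. */
typedef struct { uint8_t b[16]; } vec_u8;

/* Each result byte is picked from the 32-byte concatenation of a and b,
   using the corresponding byte of the control vector as the index. */
static vec_u8 permute(vec_u8 a, vec_u8 b, vec_u8 control) {
    vec_u8 r;
    for (int i = 0; i < 16; ++i) {
        uint8_t sel = control.b[i] & 0x1F;  /* 0..15 -> a, 16..31 -> b */
        r.b[i] = (sel < 16) ? a.b[sel] : b.b[sel - 16];
    }
    return r;
}
```

With a control vector of 15, 14, ..., 0 this reverses the bytes of `a`; with 0, 16, 1, 17, ... it interleaves the low halves of `a` and `b`. Both are the kind of data rearrangement that would otherwise take several instructions.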
Recent versions of the GNU Compiler Collection (GCC), IBM VisualAge compiler and other compilers provide intrinsics to access AltiVec instructions directly from C and C++ programs. As of version 4, GCC also includes auto-vectorization capabilities that attempt to create AltiVec-accelerated binaries without the need for the programmer to use intrinsics directly. The "vector" type keyword is introduced to permit the declaration of native vector types, e.g., "vector unsigned char foo;" declares a 128-bit vector variable named "foo" containing sixteen 8-bit unsigned chars. The full complement of arithmetic and binary operators is defined on vector types so that the normal C expression language can be used to manipulate vector variables. There are also overloaded intrinsic functions such as "vec_add" that emit the appropriate opcode based on the type of the elements within the vector, and very strong type checking is enforced. In contrast, the Intel-defined data types for IA-32 SIMD registers declare only the size of the vector register (128 or 64 bits) and, in the case of a 128-bit register, whether it contains integers or floating point values. The programmer must select the appropriate intrinsic for the data types in use, e.g., "_mm_add_epi16(x,y)" for adding two vectors containing eight 16-bit integers.
Development history
AltiVec was developed between 1996 and 1998 by a collaborative project between Apple, IBM, and Motorola. Apple was the primary customer for AltiVec until it switched to Intel-made, x86-based CPUs in 2006. Apple used it to accelerate multimedia applications such as QuickTime and iTunes, as well as key parts of Mac OS X, including the Quartz graphics compositor. Other companies such as Adobe used AltiVec to optimize their image-processing programs such as Adobe Photoshop. Motorola was the first to supply AltiVec-enabled processors, starting with their G4 line. AltiVec was also used in some embedded systems for high-performance digital signal processing.
IBM consistently left VMX out of their earlier POWER microprocessors, which were intended for server applications where it was not very useful. The POWER6 microprocessor, introduced in 2007, implements AltiVec. The implementation is similar to the one in the PowerPC 970 and Cell. The last desktop microprocessor from IBM, the PowerPC 970 (dubbed the "G5" by Apple), also implemented AltiVec with hardware similar to that of the PowerPC 7400.
AltiVec is the standard Category.VEC part of the Power ISA v.2.03 specification.
The Cell Broadband Engine, used in (amongst other things) the PlayStation 3, is also AltiVec enabled.
Freescale is bringing an enhanced version of AltiVec to e6500-based QorIQ processors.
VMX128
IBM enhanced VMX for use in Xenon (Xbox 360) and called this enhancement VMX128. The enhancements comprise new routines targeted at gaming (accelerating 3D graphics and game physics) and a total of 128 registers. VMX128 is not entirely compatible with VMX/AltiVec, as a number of integer operations were removed to make space for the larger register file and additional application-specific operations.
VSX
Power ISA v2.06 introduces the new VSX vector-scalar instructions, which extend SIMD processing for the Power ISA to support up to 64 registers, with support for regular floating point, decimal floating point and vector execution. POWER7 is the first Power Architecture processor to implement Power ISA v2.06.
Issues
In C++, the standard way of accessing AltiVec support is mutually exclusive with use of the Standard Template Library vector<> class template, because "vector" is treated as a reserved word when the compiler does not implement the context-sensitive keyword version of vector. However, it may be possible to combine them using compiler-specific workarounds; for instance, in GCC one may do #undef vector to remove the vector keyword, and then use the GCC-specific __vector keyword in its place.
Motorola/Freescale
- MPC7400
- MPC7410
- MPC7450
- MPC7445/7455
- MPC7447/7447A/7457
- MPC7448
- MPC8641/8641D
- MPC8640/8640D
- MPC8610
- QorIQ T4240
IBM
- PowerPC 970
- PowerPC 970FX
- PowerPC 970MP
- Xenon
- Cell B.E.
- PowerXCell 8i
- POWER6/POWER6+
- POWER7