TriMedia (Mediaprocessor)
TriMedia is a family of very long instruction word (VLIW) media processors from NXP Semiconductors (formerly Philips Semiconductors). TriMedia is a Harvard architecture CPU that features many DSP and SIMD operations to efficiently process audio and video data streams. On TriMedia processors, optimal performance can be achieved by programming only in C/C++, whereas most other VLIW/DSP processors require assembly language programming to achieve optimal performance. The high-level programmability of TriMedia relies on its large uniform register file and its orthogonal instruction set, in which RISC-like operations can be scheduled independently of each other in the VLIW issue slots. Furthermore, TriMedia processors have advanced caches that support unaligned accesses without performance penalty, hardware and software data/instruction prefetch, allocate-on-write-miss, and collapsed load operations that combine a traditional load with a 2-tap filter function. TriMedia development has been supported by various research studies on hardware cache coherency, multithreading and diverse accelerators to build scalable shared-memory multiprocessor systems.
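As a rough illustration of the collapsed load idea, the portable C sketch below emulates what a load fused with a 2-tap filter computes: two adjacent bytes are fetched and blended with a fractional weight in one step. The function name `load_frac8`, the weight expressed in sixteenths, and the rounding rule are assumptions made for this example only; they do not describe the actual TriMedia operation semantics or any compiler intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulation of a "collapsed load": read two neighbouring
 * bytes and return their weighted average as a single result.
 * frac is a weight in sixteenths (0..16): 0 returns p[0], 16 returns p[1]. */
static uint8_t load_frac8(const uint8_t *p, unsigned frac)
{
    assert(frac <= 16u);               /* weight must stay within 0..16 */
    unsigned a = p[0];                 /* first tap: byte at the address */
    unsigned b = p[1];                 /* second tap: the following byte */
    /* 2-tap filter with rounding: ((16 - frac) * a + frac * b + 8) / 16 */
    return (uint8_t)(((16u - frac) * a + frac * b + 8u) >> 4);
}
```

A caller doing half-pixel interpolation, as in motion compensation, would pass `frac = 8` to obtain the average of two neighbouring pixels. On hardware with such an operation, the separate loads and the arithmetic would be replaced by one instruction.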
Features
- 5 to 8 issue slots filled with up to 45 functional units
- 128 32-bit general purpose registers
- SIMD & DSP operations
- 32-bit IEEE 754 floating point operations
- 8/16/32/64 KB instruction cache, 8/16/32/64/128 KB data cache
- separate memory and peripheral bus interfaces
- up to 8 built-in timers
- up to 64 built-in vectored interrupts
- supported by an ANSI compliant C/C++ compiler toolchain
History
The first TriMedia was created in 1987 under the name LIFE-1 VLIW processor by Gerrit Slavenburg and Junien Labrousse. For the next several years LIFE was further matured internally at Philips under the guidance of Gerrit Slavenburg, which resulted in the 1996 introduction of the first TriMedia product: the TM1000 PCI Media Processor (introduced as TM-1). In 1998 the TM1100 and TM1300 (later renamed PNX1300) products were introduced.
In 2000, Philips spun out its TriMedia business to TriMedia Technologies Inc. (TTI), which was a pure “IP vendor”. TTI tried unsuccessfully to create a 64-bit next-generation TriMedia CPU architecture. In hindsight, the timing for this venture could not have been worse, as it came right at the start of the dot-com recession. In 2003 what was left of TTI was re-absorbed into Philips.
In 2002, the TM3260 CPU was released in the PNX1500 Media Processor SoC. This CPU was the first of a family of modular TriMedia CPU cores with standardized interfaces that can easily be integrated inside audio/video SoCs. The TM3260 has found use in other NXP products, e.g. the PNX8550 Home Entertainment Engine. In 2004 the super-pipelined TM5250 CPU core was announced and won the Best Media Processor of 2003 award from Microprocessor Report. This processor was made available in the PNX1700 Media Processor SoC. These two CPU cores were designed by Jan-Willem van de Waerdt in close cooperation with Gerrit Slavenburg.
In 2005, the TM3270 was announced as a low-power, H.264-capable incarnation of the TriMedia architecture (see the papers under External links), first released in the PNX4103 SoC.
In 2006, Philips Semiconductors as a whole was spun off from Philips and renamed NXP, where the TriMedia architecture continued to be developed under the guidance of Jan-Willem van de Waerdt.
In 2009, the PNX1005 became available, using the TM3282 CPU core. This CPU is the first TriMedia to have 8 issue slots. It also adds more operations for H.264 and video optimization, data-cache pre-allocation, and a bit-stream coprocessor for entropy encoding and decoding. It is also the first TriMedia to have a real-time trace block.
In 2010, the TriMedia group at NXP was terminated.
Cores
Core | Year of first silicon | ISA | Features | Cache (I/D) KB | Frequency (worst case) | Introduction technology |
---|---|---|---|---|---|---|
TM1000 | 1997 | TMA0 | | 32/16 | 100 MHz | 500 nm |
TM1100 | 1998 | TMA1 | | 32/16 | 133 MHz | 350 nm |
TM1300 | 1999 | TMA1 | | 32/16 | 166 MHz | 250 nm |
TM3260 | 2002 | TMA2 | binary compatible with TM1300 | 64/16 | 250 MHz | 130 nm |
TM5250 | 2004 | TMA3 | 128 KB L2 data cache, allocate on write miss, hardware prefetching, super pipelined (high speed) | 64/16 | 450 MHz | 130 nm |
TM2270 | 2006 | TMA3 | 96 GPRs (small area) | 32/16 | 290 MHz | 90 nm |
TM3270/1 | 2006 | TMA4 + ASE | low power | 64/128, 64/32, 32/16 | 350 MHz | 90 nm |
TM3282 | 2009 | TMA4 + ASE2 | 8-issue, pre-allocate, bit-stream coprocessor, real-time trace | 64/128 | 400 MHz | 90 nm |
TM? | 2009 | TMA4 + ASE2? | 8-issue? | ?/? | 500 MHz | 45 nm |
TriMedia cores are typically used in home, mobile and automotive products (for example, Philips' Streamium networked multimedia products). They are used as deeply embedded CPUs in SoCs as well as general-purpose DSPs. Some SoCs contain multiple TriMedia cores, such as the PNX5100, which contains three TM3271 cores.
External links
- http://www.philips.com
- http://www.tridentmicro.com/
- http://www.nxp.com
- Trident’s acquisition of NXP’s television systems and set-top box business lines
- paper: The TM3270 Media-Processor (pdf for IEEE members only)
- paper: The TM3270 Media-Processor (open to anyone)
- paper: The TM3270 Media-Processor Data Cache (pdf)
- paper: Motion Estimation Performance of the TM3270 Processor (pdf)
- EEMBC Consumer benchmark scores (EEMBC, the Embedded Microprocessor Benchmark Consortium, is a non-profit corporation)
- Press release for TM3282 EEMBC benchmark scores
- BDTi Video encoder and decoder benchmarks