TMS320C4x
The TMS320C4x is the second generation of 32-bit floating-point digital signal processors. The first family member, the TMS320C40, was introduced in 1990. TMS320C4x family members target multiprocessor floating-point DSP systems for scientific, industrial, and military applications. The TMS320C4x is similar to (and object-code compatible with) its predecessor, the TMS320C3x.

Key features of the TMS320C4x

The TMS320C4x has several key features:
  • IEEE floating-point conversion for ease of use
  • Register-based CPU
  • Single-cycle byte and half-word manipulation capabilities
  • Divide and square root support for improved performance
  • On-chip memory includes 2K words of SRAM, 128 words of program cache, and a boot loader
  • Two external buses providing an address reach of up to 4 gigawords
  • Two memory-mapped 32-bit timers
  • 6- and 12-channel DMA
  • Up to six communication ports for multiprocessor communication
  • Idle mode for reduced power consumption

Central Processing Unit (CPU)

The ’C4x’s CPU has a register-based architecture. The CPU consists of several components:

Floating-point/integer multiplier- The multiplier performs single-cycle multiplications on 32-bit integer and 40-bit floating-point values. The ’C4x implementation of floating-point arithmetic allows for floating-point operations at fixed-point speeds via a 25-ns instruction cycle and a high degree of parallelism.
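As a rough illustration of what the 25-ns cycle implies, the following sketch converts the cycle time into an instruction rate. It assumes one instruction completes per cycle, and the peak floating-point figure further assumes one multiply and one add complete in every cycle. Both are idealized upper bounds, not measured numbers, and the names used are illustrative only.

```python
def instructions_per_second(cycle_ns: float) -> float:
    """Instruction rate for a given cycle time, assuming one instruction per cycle."""
    return 1e9 / cycle_ns


CYCLE_NS = 25.0      # instruction cycle time quoted in the text
OPS_PER_CYCLE = 2    # assumption: one parallel multiply plus one add per cycle

mips = instructions_per_second(CYCLE_NS) / 1e6
peak_mflops = mips * OPS_PER_CYCLE

print(f"{mips:.0f} MIPS, {peak_mflops:.0f} MFLOPS peak (idealized)")
```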

Arithmetic Logic Unit (ALU)- The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and 40-bit floating-point data, including single-cycle integer and floating-point conversions. Results of the ALU are always maintained in 32-bit integer or 40-bit floating-point formats.

32-bit barrel shifter- The barrel shifter is coupled to the ALU and can perform shifts of up to 32 bits left or right. The shifter supports arithmetic shifts, logical shifts, and rotate-through-carry operations.
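The three kinds of shift named above can be sketched on 32-bit values as follows. This is a behavioral model written for illustration, not a description of the ’C4x instruction encodings; the function names are hypothetical, and the rotate is shown for a single bit position only.

```python
MASK32 = 0xFFFFFFFF


def logical_shift_right(value: int, count: int) -> int:
    """Shift right, filling the vacated high bits with zeros (0 <= count <= 32)."""
    return (value & MASK32) >> count


def arithmetic_shift_right(value: int, count: int) -> int:
    """Shift right, filling the vacated high bits with copies of the sign bit."""
    value &= MASK32
    shifted = value >> count
    if value >> 31:
        # Negative number: set the top `count` bits to one.
        shifted |= (MASK32 << (32 - count)) & MASK32
    return shifted


def rotate_left_through_carry(value: int, carry: int) -> tuple[int, int]:
    """Rotate one position left through the carry bit (a 33-bit rotation).

    Returns (new_value, new_carry).
    """
    value &= MASK32
    new_carry = value >> 31
    new_value = ((value << 1) & MASK32) | (carry & 1)
    return new_value, new_carry
```

For example, shifting 0x80000000 right by four positions gives 0x08000000 logically but 0xF8000000 arithmetically, because the arithmetic form preserves the sign.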

Internal buses (CPU1/CPU2 and REG1/REG2)- Four internal buses, CPU1, CPU2, REG1, and REG2, carry two operands from memory and two operands from the register file, thus allowing parallel multiplies and adds/subtracts on four integer or floating-point operands in a single cycle.

Auxiliary register arithmetic units (ARAU)- The two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two addresses in a single cycle. The ARAUs operate in parallel with the multiplier and ALU. They support addressing with displacements, index registers (IR0 and IR1), and circular and bit-reversed addressing.
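To make circular and bit-reversed addressing concrete, the sketch below generates the index sequences these modes produce. It models only the resulting access order, not how the ARAUs compute it in hardware; the function names and the 8-point example are illustrative assumptions.

```python
def circular_next(index: int, step: int, length: int) -> int:
    """Advance an index inside a circular buffer of `length` entries."""
    return (index + step) % length


def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Circular addressing: walk a 4-entry delay line with a step of 1.
circular_order = []
position = 2
for _ in range(6):
    circular_order.append(position)
    position = circular_next(position, 1, 4)

# Bit-reversed addressing: the access order typically used by an 8-point FFT.
fft_order = [bit_reverse(i, 3) for i in range(8)]

print(circular_order)
print(fft_order)
```

Circular addressing suits FIR delay lines, where the oldest sample is overwritten in place, and bit-reversed addressing suits FFTs, whose inputs or outputs are naturally in bit-reversed order.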

CPU Primary register file- The ’C4x primary register file provides 32 registers in a multiport register file that is tightly coupled to the CPU. All of the primary register file registers can be operated upon by the multiplier and ALU and can be used as general-purpose registers.

CPU Expansion Register File- In addition to the CPU primary register file, the ’C4x has an expansion register file that contains two special registers acting as pointers:
  1. The IVTP register points to the interrupt-vector table (IVT), which defines vectors for all interrupts.
  2. The TVTP register points to the trap vector table (TVT), which defines vectors for 512 traps.

Memory organization

The total memory reach of the ’C4x is 4G 32-bit words. Program memory (on-chip RAM or ROM and external memory) as well as the registers affecting timers, communication ports, and DMA channels are contained within this space. This allows tables, coefficients, program code, and data to be stored in either RAM or ROM. Thus, memory usage is maximized, and memory space can be allocated as desired.

Memory Map- The memory map for each processor is shown in Figure. The level at the external pin ROMEN determines whether the first megaword of the address space maps to the internal ROM or to external memory. The maps illustrate the entire address space of the ’C40 and ’C44. The value of ROMEN affects only the first megaword of memory.
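The effect of ROMEN can be sketched as a simple address decode. This is a simplified model: it assumes that a high level on ROMEN selects the internal ROM, treats a megaword as 2^20 words, and ignores any reserved regions in the real map. The function name is hypothetical.

```python
MEGAWORD = 1 << 20          # 2**20 words
ADDRESS_SPACE = 1 << 32     # 4G words of total reach


def first_megaword_target(address: int, romen: bool) -> str:
    """Say where a word address is routed, as far as ROMEN is concerned.

    Assumption: romen=True selects the internal ROM for the first megaword.
    """
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address outside the 32-bit word address space")
    if address >= MEGAWORD:
        return "unaffected by ROMEN"
    return "internal ROM" if romen else "external memory"


print(first_megaword_target(0x0000_0000, romen=True))
print(first_megaword_target(0x0000_0000, romen=False))
print(first_megaword_target(0x0010_0000, romen=True))
```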

Memory Addressing Modes- The ’C4x supports a base set of general-purpose instructions as well as arithmetic-intensive instructions that are particularly suited for digital signal processing and other numeric-intensive applications.

The following list shows the addressing modes with their addressing types:
  • General addressing modes:
    • Register. The operand is a CPU register.
    • Immediate. The operand is a 16-bit immediate value.
    • Direct. The operand is the contents of a 32-bit address.
    • Indirect. A 32-bit auxiliary register indicates the address of the operand.

  • Three-operand addressing modes:
    • Register. (same as for general addressing mode).
    • Indirect. (same as for general addressing mode).
    • Immediate. The operand is an 8-bit immediate value.

  • Parallel addressing modes:
    • Register. The operand is an extended-precision register.
    • Indirect. (same as for general addressing mode).

  • Branch addressing modes:
    • Register. (same as for general addressing mode).
    • PC-relative. A signed 16-bit displacement or a 24-bit displacement is added to the PC.
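The PC-relative mode relies on sign extension of the 16-bit or 24-bit displacement. The sketch below shows that step and a simplified target calculation. It assumes the displacement is added directly to the PC value supplied and wraps at 32 bits; any extra offset the real processor applies is deliberately left out. The function names are illustrative.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def branch_target(pc: int, displacement: int, bits: int) -> int:
    """Simplified PC-relative target: PC plus the signed displacement, modulo 2**32."""
    return (pc + sign_extend(displacement, bits)) & 0xFFFFFFFF


print(sign_extend(0xFFFF, 16))                    # 16-bit field, all ones
print(hex(branch_target(0x1000, 0xFFFC, 16)))     # backward branch by four words
print(hex(branch_target(0x1000, 0x000010, 24)))   # forward branch by sixteen words
```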

Internal buses

A large portion of the ’C4x’s high performance is due to internal busing and parallelism. Separate buses allow for parallel program fetching, data accessing, and DMA accessing:
  • Program buses- PADDR and PDATA
  • Data buses- DADDR1, DADDR2, and DDATA
  • DMA buses- DMAADDR and DMADATA

External bus operation

The ’C4x provides two identical external interfaces: the global memory interface and the local memory interface. Each consists of a 32-bit data bus, a 31-bit (’C40) or 24-bit (’C44) address bus, and two sets of control signals. Both buses can be used to address external program/data memory or I/O space.
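The address-bus widths relate to the memory reach by simple arithmetic, sketched below. The calculation assumes each address line doubles the number of addressable 32-bit words and that the two interfaces cover separate parts of the address space; the variable names are illustrative.

```python
WORD_BYTES = 4  # one 32-bit word


def bus_reach_words(address_bits: int) -> int:
    """Number of words addressable with the given number of address lines."""
    return 1 << address_bits


c40_per_bus = bus_reach_words(31)      # 2G words on each 'C40 interface
c40_total = 2 * c40_per_bus            # global + local interfaces
c44_per_bus = bus_reach_words(24)      # 16M words on each 'C44 interface

print(c40_total == 1 << 32)                   # matches the 4-gigaword reach
print(c40_total * WORD_BYTES // (1 << 30))    # total reach in GiB
print(c44_per_bus // (1 << 20))               # 'C44 reach per bus in megawords
```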

Interrupts

The ’C4x supports four external interrupts (IIOF3–0), a number of internal interrupts, a non-maskable external NMI interrupt, and a non-maskable external RESET signal, which sets the processor to a known state. The DMA and communication ports have their own internal interrupts. When the CPU responds to the interrupt, the IACK pin can be used to signal an external interrupt acknowledge.

Peripherals

All ’C4x on-chip peripherals are controlled through memory-mapped registers on a dedicated peripheral bus. This peripheral bus is composed of a 32-bit data bus and a 32-bit address bus. The ’C4x peripherals include two timers and six (’C40) or four (’C44) communication ports.

Pipeline operation

Two characteristics of the ’C4x that contribute to its high performance are pipelining and concurrent I/O and CPU operation. Four functional units control ’C4x pipeline operation: fetch, decode, read, and execute. Pipelining is the overlapped, parallel operation of the fetch, decode, read, and execute stages of successive instructions.

The four major units of the ’C4x pipeline structure and their functions are as follows:
  • Fetch Unit (F)- Fetches the instruction words from memory and updates the program counter.
  • Decode Unit (D)- Decodes the instruction word and performs address generation. It also controls modification of the ARn registers in indirect addressing mode, and of the stack pointer when a PUSH to or POP from the stack occurs.
  • Read Unit (R)- If required, reads the operands from memory.
  • Execute Unit (E)- If required, reads the operands from the register file, performs the necessary operation, and writes results to the register file. If required, results of previous operations are written to memory.
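The overlap of the four stages can be visualized with a small model of an ideal pipeline. This sketch assumes every stage takes one cycle and that no stalls or conflicts occur, which real code will not always achieve; the function name and output layout are illustrative.

```python
STAGES = ["F", "D", "R", "E"]  # fetch, decode, read, execute


def pipeline_schedule(num_instructions: int) -> list[dict[str, int]]:
    """For each cycle, map each busy stage to the instruction index it holds.

    Ideal case only: one cycle per stage, no stalls.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            instruction = cycle - depth
            if 0 <= instruction < num_instructions:
                row[stage] = instruction
        schedule.append(row)
    return schedule


for cycle, row in enumerate(pipeline_schedule(5)):
    cells = " ".join(f"{stage}:I{row[stage]}" if stage in row else f"{stage}:--"
                     for stage in STAGES)
    print(f"cycle {cycle}: {cells}")
```

Once the pipeline is full (from the fourth cycle onward in this model), one instruction finishes its execute stage every cycle even though each instruction takes four cycles from fetch to execute.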
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 