Modified Harvard architecture

The modified Harvard architecture is a variation of the Harvard computer architecture that allows the contents of the instruction memory to be accessed as if they were data. Most modern computers that are documented as Harvard architecture are, in fact, modified Harvard architecture.

Harvard Architecture

The original Harvard architecture computer, the Harvard Mark I, employed entirely separate memory systems to store instructions and data. The CPU fetched the next instruction and loaded or stored data simultaneously and independently. This is in contrast to a von Neumann architecture computer, in which both instructions and data are stored in the same memory system and (without the complexity of a cache) must be accessed in turn. The physical separation of instruction and data memory is sometimes held to be the distinguishing feature of modern Harvard architecture computers. With microcontrollers (entire computer systems integrated onto single chips), the use of different memory technologies for instructions (e.g. flash memory) and data (typically read/write memory) in von Neumann machines is becoming popular. The true distinction of a Harvard machine is that instruction and data memory occupy different address spaces. In other words, a memory address does not uniquely identify a storage location (as it does in a von Neumann machine); you also need to know the memory space (instruction or data) to which the address belongs.
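The point about address spaces can be illustrated with a minimal sketch. The class names `HarvardMemory` and `VonNeumannMemory`, the memory sizes, and the stored values are all hypothetical and chosen only for illustration; they do not model any real machine.

```python
from dataclasses import dataclass, field


@dataclass
class HarvardMemory:
    """Toy model: two independent memories, each with its own address 0."""
    instructions: list = field(default_factory=lambda: ["NOP"] * 4)
    data: list = field(default_factory=lambda: [0] * 4)

    def read(self, space: str, address: int):
        # The address alone is ambiguous; the space selects the memory.
        if space == "instruction":
            return self.instructions[address]
        if space == "data":
            return self.data[address]
        raise ValueError(f"unknown memory space: {space}")


@dataclass
class VonNeumannMemory:
    """Toy model: one memory, so an address names exactly one cell."""
    cells: list = field(default_factory=lambda: [0] * 8)

    def read(self, address: int):
        return self.cells[address]


harvard = HarvardMemory()
harvard.instructions[0] = "LOAD"
harvard.data[0] = 42

von_neumann = VonNeumannMemory()
von_neumann.cells[0] = 42

# Same address, different spaces, different storage locations.
at_instruction_zero = harvard.read("instruction", 0)
at_data_zero = harvard.read("data", 0)
```

In the Harvard model a read needs the pair (space, address), whereas the von Neumann model needs only the address.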

A computer with a von Neumann architecture has the advantage over pure Harvard machines that code can be accessed and treated the same as data, and vice versa. This allows, for example, data to be read from disk storage and executed as code, or self-optimizing software systems using technologies such as just-in-time compilation to write machine code into their own memory and later execute it. Another example is self-modifying code, which allows a program to modify itself. A disadvantage of these methods is that they raise issues with executable space protection, which increases the risks from malware and software defects. In addition, code flow in such systems is notoriously difficult to document, and debugging can be much more difficult.
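As a rough illustration of code being treated as data, the following sketch runs a toy von Neumann machine whose program overwrites one of its own instructions before reaching it. The instruction names (`ADD`, `STORE`, `HALT`) and the `run` function are invented for this example and do not correspond to any real instruction set.

```python
def run(memory, max_steps=100):
    """Run a toy von Neumann machine; return the accumulator at HALT.

    Instructions and data share `memory`. Each instruction is a tuple:
    ("ADD", n), ("STORE", address, value), or ("HALT",).
    """
    acc = 0
    pc = 0
    for _ in range(max_steps):
        op = memory[pc]
        pc += 1
        if op[0] == "ADD":
            acc += op[1]
        elif op[0] == "STORE":
            # An ordinary data write; the target may be fetched later.
            memory[op[1]] = op[2]
        elif op[0] == "HALT":
            return acc
        else:
            raise ValueError(f"cannot execute {op!r}")
    raise RuntimeError("step limit reached")


program = [
    ("ADD", 1),
    ("STORE", 2, ("ADD", 10)),  # overwrite the cell at address 2
    ("ADD", 100),               # replaced before it is fetched
    ("HALT",),
]
result = run(list(program))  # 1 + 10 = 11, not 1 + 100
```

On a pure Harvard machine the `STORE` would land in data memory and could never change what the instruction fetch sees, so this pattern would not be possible.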

Accordingly, some pure Harvard machines are specialty products. Most modern computers instead implement a modified Harvard architecture. These modifications are various ways to loosen the strict separation between code and data while still supporting the higher-performance concurrent data and instruction access of the Harvard architecture.

The most common modification builds a memory hierarchy with a CPU cache that separates instructions and data. This unifies all except small portions of the data and instruction address spaces, providing the von Neumann model. Most programmers never need to be aware that the processor core implements a (modified) Harvard architecture, although they benefit from its speed advantages. Only programmers who write instructions into data memory need to be aware of issues such as cache coherency and executable space protection.
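The coherency issue faced by programs that write instructions as data can be sketched as follows. This is a simplified model that assumes a write-back data cache and no hardware coherence between the instruction and data caches; the class `SplitCacheCPU` and its method names are hypothetical, and real processors differ in how (and whether) software must synchronize the two caches.

```python
class SplitCacheCPU:
    """Toy core: unified main memory behind separate I- and D-caches."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.icache = {}  # address -> value seen by instruction fetch
        self.dcache = {}  # address -> value seen by loads and stores

    def fetch(self, address):
        # Instruction fetch fills and reads the instruction cache only.
        if address not in self.icache:
            self.icache[address] = self.memory[address]
        return self.icache[address]

    def store(self, address, value):
        # Write-back data cache: main memory is not updated yet.
        self.dcache[address] = value

    def sync_code(self, address):
        # Write the data-cache line back, then drop the stale I-cache line.
        if address in self.dcache:
            self.memory[address] = self.dcache[address]
        self.icache.pop(address, None)


cpu = SplitCacheCPU(["OLD"])
first = cpu.fetch(0)   # warms the instruction cache
cpu.store(0, "NEW")    # program writes a new instruction as data
stale = cpu.fetch(0)   # still the old instruction
cpu.sync_code(0)
fresh = cpu.fetch(0)   # now the new instruction
```

Until the explicit synchronization step, the instruction fetch path keeps returning the old contents even though the program has already stored the new instruction.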

Another change preserves the "separate address space" nature of a Harvard machine, just providing special machine operations to access the contents of the instruction memory as data. Because data is not directly executable as instructions, such machines are not always viewed as "modified" Harvard architecture:

Read access: initial data values can be copied from the instruction memory into the data memory when the program starts. Or, if the data is not to be modified (it might be a constant value, such as pi, or a text string), it can be accessed by the running program directly from instruction memory without taking up space in data memory (which is often at a premium).
Write access: a capability for reprogramming is generally required; few computers are purely ROM based. For example, a microcontroller usually has operations to write to the flash memory used to hold its instructions. This capability may be used for purposes including software updates and EEPROM replacement.
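The read and write access described above can be modeled with a short sketch. The class `HarvardMCU`, its method names, and the memory layout are hypothetical. The model also ignores details of real flash memory, such as the need to erase a block before rewriting it.

```python
class HarvardMCU:
    """Toy microcontroller with separate program and data spaces."""

    def __init__(self, program_memory, data_size):
        self.program = list(program_memory)  # stands in for flash
        self.data = [0] * data_size          # stands in for RAM

    def load_program_memory(self, address):
        # Special read operation: program space accessed as data.
        return self.program[address]

    def store_program_memory(self, address, value):
        # Special write operation, e.g. for a software update.
        self.program[address] = value

    def copy_initial_data(self, source, destination, length):
        # Startup step: copy initial values from program to data memory.
        for offset in range(length):
            self.data[destination + offset] = self.load_program_memory(
                source + offset
            )


# Program memory layout (hypothetical): code at 0-1, constants at 2-4.
mcu = HarvardMCU(["INSTR_A", "INSTR_B", 3, 1, 4], data_size=4)
mcu.copy_initial_data(source=2, destination=0, length=3)
constant = mcu.load_program_memory(2)   # read in place, no RAM used
mcu.store_program_memory(1, "INSTR_C")  # reprogram one location
```

Ordinary loads and stores in this model would touch only `data`; program memory is reachable as data solely through the two special operations.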

Comparisons

Three characteristics may be used to distinguish Modified Harvard machines from Harvard and Von Neumann machines:

Instruction and data memories occupy different address spaces. For pure Harvard machines, there is an address 'zero' in instruction space that refers to an instruction storage location and a separate address 'zero' in data space that refers to a distinct data storage location. By contrast, Von Neumann and modified Harvard machines store both instructions and data in a single address space, so address 'zero' refers to only one thing and whether the binary pattern in that location is interpreted as an instruction or data is defined by how the program is written. This characteristic unambiguously identifies a pure Harvard machine.
- By a strict interpretation of this distinction, for example, the Microchip PIC17 and PIC18 architectures, as well as the Atmel 8-bit AVR architecture, would be regarded as pure Harvard Architecture machines because they do, in fact, maintain a distinct separation between code and data spaces, and address 'zero' of each does, in fact, refer to a physically different piece of memory. However, the distinction is made ambiguous by the colloquial use of the term "modified Harvard Architecture" to refer to such machines' inclusion of special instructions to read and/or write the contents of code space as though it were data.

Instruction and data memories have separate hardware pathways to the central processing unit (CPU). This is the point of pure or modified Harvard machines, and why they co-exist with the more flexible and general von Neumann architecture: separate memory pathways to the CPU allow instructions to be fetched and data to be accessed at the same time, improving throughput. The pure Harvard machines have separate pathways with separate address spaces. Modified Harvard machines have such separate access paths for CPU caches or other tightly coupled memories, but a unified address space covers the rest of the memory hierarchy. A von Neumann processor has only that unified address space. From a programmer's point of view, a modified Harvard processor is usually treated as a von Neumann machine until cache coherency becomes an issue, as with self-modifying code and program loading. This can be confusing, but such issues are usually visible only to systems programmers and integrators.

Instruction and data memories may be accessed in different ways. The original Harvard machine, the Mark I, stored instructions on a punched paper tape and data in electro-mechanical counters. This, however, was entirely due to the limitations of technology available at the time. Today a Harvard machine such as the PIC microcontroller might use 12-bit wide flash memory for instructions and 8-bit wide SRAM for data. In contrast, a von Neumann microcontroller such as an ARM7TDMI, or a modified Harvard ARM9 core, necessarily provides uniform access to flash and SRAM (as 8-bit bytes, in those cases).
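The throughput benefit of separate pathways, the second characteristic above, can be put into numbers with a deliberately crude model. The function `cycles_needed` is hypothetical. It assumes that each path carries one access per cycle, that every instruction needs exactly one fetch, and that at most one data access accompanies each instruction. Real pipelines and caches are considerably more complicated.

```python
def cycles_needed(instruction_count, data_accesses, separate_paths):
    """Count memory cycles in a toy model with one access per path per cycle.

    Assumes every instruction needs one fetch and that a data access can
    overlap a fetch only when the paths are separate.
    """
    if data_accesses > instruction_count:
        raise ValueError("model allows at most one data access per instruction")
    if separate_paths:
        # Fetches and data accesses proceed in parallel.
        return instruction_count
    # A single shared path serializes fetches and data accesses.
    return instruction_count + data_accesses


shared = cycles_needed(100, 40, separate_paths=False)  # 100 + 40 = 140
split = cycles_needed(100, 40, separate_paths=True)    # 100
```

Under these assumptions, a program of 100 instructions with 40 data accesses needs 140 memory cycles on a single shared path and 100 with separate paths.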

Modern uses of the Modified Harvard architecture

Outside of applications where a cacheless DSP or microcontroller is required, most modern processors have a CPU cache which partitions instruction and data. Accordingly, they are hybrids of the Harvard and von Neumann models, and are best viewed as implementing a modified Harvard architecture. Examples include the x86 processors found in most desktop computers, and ARM cores embedded as applications processors in cell phones. MIPS, Blackfin, PowerPC, and many other processor families implement this flavor of modified Harvard architecture.

There are also processors which are Harvard machines by the most rigorous definition (that program and data memory occupy different address spaces), and are only modified in the weak sense that there are operations to read and/or write program memory as data. For example, the LPM (Load Program Memory) and SPM (Store Program Memory) instructions in the Atmel AVR implement such a modification. Similar solutions are found in other microcontrollers such as the PIC and Z8 Encore!, in many families of digital signal processors such as the TI C55x cores, and more. Because instruction execution is still restricted to the program address space, these processors are very unlike von Neumann machines.

Having separate address spaces creates certain difficulties in programming with high-level languages such as C, which don't directly support the notion that tables of read-only data might be in a different address space than normal writable data (and thus need to be read using different instructions).
