
Computer performance
    
        Computer performance is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.
Depending on the context, good computer performance may involve one or more of the following:
- Short response time for a given piece of work
- High throughput (rate of processing work)
- Low utilization of computing resource(s)
- High availability of the computing system or application
- Fast (or highly compact) data compression and decompression
- High bandwidth / short data transmission time
Performance metrics
Computer performance metrics include availability, response time, channel capacity, latency, completion time, service time, bandwidth, throughput, relative efficiency, scalability, performance per watt, compression ratio, instruction path length and speed up. CPU benchmarks are available.
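Two of these metrics, response time and throughput, can be estimated for a piece of software with a simple timing loop. The following is a minimal sketch, not a rigorous benchmark: the helper `measure` and the workload `sample_task` are hypothetical names introduced for illustration, and the sketch ignores warm-up, caching and interference from other processes.

```python
import time


def measure(task, repetitions=1000):
    """Run `task` repeatedly and return (mean response time in seconds,
    throughput in calls per second)."""
    start = time.perf_counter()
    for _ in range(repetitions):
        task()
    # Guard against a zero reading on a very coarse clock.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return elapsed / repetitions, repetitions / elapsed


def sample_task():
    # Hypothetical workload standing in for "a given piece of work".
    sum(i * i for i in range(1000))


mean_response, throughput = measure(sample_task)
print(f"mean response time: {mean_response * 1e6:.1f} microseconds")
print(f"throughput: {throughput:.0f} calls per second")
```

For a single task run back to back, the two numbers are reciprocals of each other. In systems that process work concurrently they can diverge, which is one reason both are reported.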
Aspect of software quality
Computer software performance, particularly software application response time, is an aspect of software quality that is important in human–computer interactions.
Technical and non-technical definitions
The performance of any computer system can be evaluated in measurable, technical terms, using one or more of the metrics listed above. This way the performance can be:
- compared relative to other systems or the same system before/after changes
- defined in absolute terms, e.g. for fulfilling a contractual obligation
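The two kinds of evaluation can be illustrated with a short sketch. The function names `speedup` and `meets_target` and all numbers are assumptions made for this example: the first compares a system with itself before and after a change, and the second checks a measurement against a fixed limit such as one agreed in a contract.

```python
def speedup(time_before, time_after):
    """Relative evaluation: how many times faster the changed system is."""
    return time_before / time_after


def meets_target(measured_time, agreed_limit):
    """Absolute evaluation: does the measurement satisfy a fixed limit?"""
    return measured_time <= agreed_limit


# Hypothetical response times in seconds, before and after a change.
before, after = 2.4, 1.5
print(f"speedup: {speedup(before, after):.2f}x")  # speedup: 1.60x
print(meets_target(after, agreed_limit=2.0))      # True
```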
Whilst the above definition relates to a scientific, technical approach, the following definition given by Arnold Allen would be useful for a non-technical audience:
The word performance in computer performance means the same thing that performance means in other contexts, that is, it means "How well is the computer doing the work it is supposed to do?"
Technical performance metrics
There is a wide variety of technical performance metrics that indirectly affect overall computer performance. Because there are too many programs to test a CPU's speed on all of them, benchmarks were developed.
The most famous benchmarks are the SPECint and SPECfp benchmarks developed by the Standard Performance Evaluation Corporation and the ConsumerMark benchmark developed by the Embedded Microprocessor Benchmark Consortium (EEMBC).
Some important measurements include:
- Instructions per second – Most consumers pick a computer architecture (normally Intel IA32 architecture) to be able to run a large base of pre-existing, pre-compiled software. Being relatively uninformed on computer benchmarks, some of them pick a particular CPU based on operating frequency (see megahertz myth).
- FLOPS – The number of floating-point operations per second is often important in selecting computers for scientific computations.
- Performance per watt – System designers building parallel computers, such as Google, pick CPUs based on their speed per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself. http://www.eembc.org/benchmark/consumer.asp?HTYPE=SIM http://news.com.com/Power+could+cost+more+than+servers,+Google+warns/2100-1010_3-5988090.html
- Some system designers building parallel computers pick CPUs based on the speed per dollar.
- System designers building real-time computing systems want to guarantee worst-case response. That is easier to do when the CPU has low interrupt latency and when it has deterministic response. (DSP)
- Computer programmers who program directly in assembly language want a CPU to support a full-featured instruction set.
- Low power – For systems with limited power sources (e.g. solar, batteries, human power).
- Small size or low weight – For portable embedded systems, systems for spacecraft.
- Environmental impact – Minimizing environmental impact of computers during manufacturing and recycling as well as during use. Reducing waste, reducing hazardous materials. (see Green computing).
- Giga-updates per second – A measure of how frequently the RAM can be updated.
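Different measurements can rank the same CPUs differently. The sketch below makes that concrete for performance per watt and speed per dollar. The `Cpu` class, the candidate names and every figure in it are invented for illustration and do not describe real products.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    name: str
    gflops: float  # sustained floating-point rate, in GFLOPS
    watts: float   # power draw under load, in watts
    price: float   # purchase price, in dollars

    def gflops_per_watt(self):
        return self.gflops / self.watts

    def gflops_per_dollar(self):
        return self.gflops / self.price


# Hypothetical candidates with made-up figures.
candidates = [
    Cpu("fast-but-hungry", gflops=400.0, watts=200.0, price=800.0),
    Cpu("slow-but-frugal", gflops=150.0, watts=50.0, price=500.0),
]

best_per_watt = max(candidates, key=Cpu.gflops_per_watt)
best_per_dollar = max(candidates, key=Cpu.gflops_per_dollar)
print(best_per_watt.name)    # slow-but-frugal (3.0 vs 2.0 GFLOPS per watt)
print(best_per_dollar.name)  # fast-but-hungry (0.5 vs 0.3 GFLOPS per dollar)
```

With these invented figures the two criteria pick different CPUs, which is why the designer has to decide which measurement matters for the system being built.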
Occasionally a CPU designer can find a way to make a CPU with better overall performance by improving one of these technical performance metrics without sacrificing any other (relevant) technical performance metric, for example by building the CPU out of better, faster transistors.
However, sometimes pushing one technical performance metric to an extreme leads to a CPU with worse overall performance, because other important technical performance metrics were sacrificed to get one impressive-looking number—for example, the megahertz myth.
The total amount of time (t) required to execute a particular benchmark program is
    t = N × C / f
where
- N is the number of instructions actually executed (the instruction path length). The code density of the instruction set strongly affects N. The value of N can either be determined exactly by using an instruction set simulator (if available) or by estimation, itself based partly on estimated or actual frequency distribution of input variables and by examining generated machine code from an HLL compiler. It cannot be determined from the number of lines of HLL source code. N is not affected by other processes running on the same processor. The significant point here is that hardware normally does not keep track of (or at least make easily available) a value of N for executed programs. The value can therefore only be accurately determined by instruction set simulation, which is rarely practiced.
- f is the clock frequency in cycles per second.
- C is the average cycles per instruction (CPI) for this benchmark.
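From these definitions, N instructions at an average of C cycles each need N × C cycles, and a clock running at f cycles per second completes them in t = N × C / f seconds. A minimal sketch of that calculation follows; the function name `execution_time` and the sample values are assumptions chosen for illustration.

```python
def execution_time(n_instructions, cycles_per_instruction, clock_hz):
    """Return t = N * C / f, the benchmark execution time in seconds."""
    return n_instructions * cycles_per_instruction / clock_hz


# Hypothetical benchmark: 2 billion instructions, CPI of 1.5, 3 GHz clock.
t = execution_time(2e9, 1.5, 3e9)
print(f"{t:.2f} s")  # 1.00 s
```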
Even on one machine, a different compiler or the same compiler with different compiler optimization switches can change N and C. The benchmark executes faster if the new compiler can improve N or C without making the other worse, but often there is a trade-off between them: is it better, for example, to use a few complicated instructions that take a long time to execute, or to use instructions that execute very quickly, although it takes more of them to execute the benchmark?
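The trade-off can be made concrete by comparing two builds of the same benchmark on the same clock. This is only a sketch: `execution_time` applies t = N × C / f, and the instruction counts and CPI values for both builds are invented, so the outcome says nothing about which style of instruction is better in general.

```python
def execution_time(n_instructions, cycles_per_instruction, clock_hz):
    """Return t = N * C / f in seconds."""
    return n_instructions * cycles_per_instruction / clock_hz


CLOCK_HZ = 2e9  # same machine, so f is identical for both builds

# Hypothetical build A: few complicated instructions, each slow.
build_a = execution_time(1.0e9, 4.0, CLOCK_HZ)
# Hypothetical build B: many simple instructions, each fast.
build_b = execution_time(2.5e9, 1.2, CLOCK_HZ)

print(f"build A: {build_a:.2f} s")  # build A: 2.00 s
print(f"build B: {build_b:.2f} s")  # build B: 1.50 s
```

Only the product N × C matters on a fixed clock: build B executes 2.5 times as many instructions yet finishes first because its CPI is lower by a larger factor.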
A CPU designer is often required to implement a particular instruction set, and so cannot change N.
Sometimes a designer focuses on improving performance by making significant improvements in f (with techniques such as deeper pipelines and faster caches), while (hopefully) not sacrificing too much C, leading to a speed-demon CPU design.
Sometimes a designer focuses on improving performance by making significant improvements in C (with techniques such as out-of-order execution, superscalar CPUs, larger caches, caches with improved hit rates, improved branch prediction, speculative execution, etc.), while (hopefully) not sacrificing too much clock frequency, leading to a brainiac CPU design.
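With N fixed by the instruction set, execution time is proportional to C / f, so both approaches can be compared against a baseline. The sketch below uses invented values for all three designs. It illustrates the arithmetic only and does not describe any real CPU.

```python
def execution_time(n_instructions, cycles_per_instruction, clock_hz):
    """Return t = N * C / f in seconds."""
    return n_instructions * cycles_per_instruction / clock_hz


N = 1e9  # fixed by the instruction set, identical for every design

# Hypothetical designs: (average CPI, clock frequency in Hz).
designs = {
    "baseline": (1.00, 2e9),
    "speed-demon": (1.25, 3e9),  # clock 50% faster, CPI 25% worse
    "brainiac": (0.80, 2e9),     # same clock, CPI 20% better
}

times = {name: execution_time(N, c, f) for name, (c, f) in designs.items()}
for name, t in times.items():
    print(f"{name}: {t:.3f} s")
# baseline: 0.500 s
# speed-demon: 0.417 s
# brainiac: 0.400 s
```

A frequency increase pays off only if C grows by a smaller factor than f does. In this example the clock is 1.5 times faster while CPI is 1.25 times worse, so the speed-demon design still beats the baseline.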
See also
- Algorithmic efficiency
- Benchmark (computing)
- Computer architecture
- Network performance
- Optimization (computer science)
- Perceived performance
- Performance analysis
- Performance engineering
- Performance management
- Performance testing
- Performance tuning



