Supercomputer

A supercomputer is a computer at the frontline of current processing capacity, particularly speed of calculation.

Supercomputers are used for highly calculation-intensive tasks such as problems in quantum physics, weather forecasting, climate research, molecular modeling (computing the structures and properties of chemical compounds, biological macromolecules, polymers, and crystals), and physical simulations (such as simulations of airplanes in wind tunnels, the detonation of nuclear weapons, and nuclear fusion).

Supercomputers were introduced in the 1960s and were designed primarily by Seymour Cray at Control Data Corporation (CDC), which led the market into the 1970s until Cray left to form his own company, Cray Research. He then took over the supercomputer market with his new designs, holding the top spot in supercomputing for five years (1985–1990). In the 1980s a large number of smaller competitors entered the market, paralleling the creation of the minicomputer market a decade earlier, but many of these disappeared in the mid-1990s "supercomputer market crash".

Today, supercomputers are typically one-of-a-kind custom designs produced by traditional companies such as Cray, IBM and Hewlett-Packard, which purchased many of the 1980s companies to gain their experience. Currently, Japan's K computer, built by Fujitsu in Kobe, Japan, is the fastest in the world. It is three times faster than the previous holder of that title, the Tianhe-1A supercomputer located in China.

The term supercomputer itself is rather fluid, and the speed of earlier "supercomputers" tends to become typical of later ordinary computers. CDC's early machines were simply very fast scalar processors, some ten times the speed of the fastest machines offered by other companies. In the 1970s most supercomputers were dedicated to running a vector processor, and many of the newer players developed their own such processors at lower prices to enter the market. In the early and mid-1980s, machines with a modest number of vector processors working in parallel became the standard, with typical processor counts in the range of four to sixteen. In the later 1980s and 1990s, attention turned from vector processors to massively parallel processing systems with thousands of "ordinary" CPUs, some being off-the-shelf units and others being custom designs (see the Transputer, for instance). Today, parallel designs are based on "off the shelf" server-class microprocessors, such as the PowerPC, Opteron, or Xeon, and coprocessors such as NVIDIA Tesla GPGPUs, AMD GPUs, the IBM Cell, and FPGAs. Most modern supercomputers are now highly tuned computer clusters using commodity processors combined with custom interconnects.

Relevant here is the distinction between capability computing and capacity computing, as defined by Graham et al. Capability computing is typically thought of as using the maximum computing power to solve a large problem in the shortest amount of time; often a capability system is able to solve a problem of a size or complexity that no other computer can. Capacity computing, in contrast, is typically thought of as using efficient, cost-effective computing power to solve somewhat large problems or many small problems, or to prepare for a run on a capability system.

History

The history of supercomputing goes back to the 1960s when a series of computers at Control Data Corporation (CDC) were designed by Seymour Cray to use innovative designs and parallelism to achieve superior computational peak performance. The CDC 6600, released in 1964, is generally considered the first supercomputer.

Cray left CDC in 1972 to form his own company. Four years after leaving CDC, Cray delivered the 80 MHz Cray-1 in 1976, and it became one of the most successful supercomputers in history. The Cray-2, released in 1985, was an 8-processor liquid-cooled computer; Fluorinert was pumped through it as it operated. It performed at 1.9 gigaflops and was the world's fastest until 1990.

While the supercomputers of the 1980s used only a few processors, in the 1990s machines with thousands of processors began to appear both in the United States and in Japan, setting new computational performance records. Fujitsu's Numerical Wind Tunnel supercomputer used 166 vector processors to gain the top spot in 1994, with a peak speed of 1.7 gigaflops per processor. The Hitachi SR2201 obtained a peak performance of 600 gigaflops in 1996 by using 2048 processors connected via a fast three-dimensional crossbar network. The Intel Paragon could have 1000 to 4000 Intel i860 processors in various configurations and was ranked the fastest in the world in 1993. The Paragon was a MIMD machine which connected processors via a high-speed two-dimensional mesh, allowing processes to execute on separate nodes and communicate via the Message Passing Interface.

Current research using supercomputers

The IBM Blue Gene/P computer has been used to simulate a number of artificial neurons equivalent to approximately one percent of a human cerebral cortex, containing 1.6 billion neurons with approximately 9 trillion connections. The same research group also succeeded in using a supercomputer to simulate a number of artificial neurons equivalent to the entirety of a rat's brain.

Modern-day weather forecasting also relies on supercomputers. The National Oceanic and Atmospheric Administration uses supercomputers to crunch hundreds of millions of observations to help make weather forecasts more accurate.

In 2011, the challenges and difficulties in pushing the envelope in supercomputing were underscored by IBM's abandonment of the Blue Waters petascale project.

This is a recent list of the computers which appeared at the top of the TOP500 list; the "Peak speed" is given as the "Rmax" rating. For more historical data, see History of supercomputing.

Year   Supercomputer        Peak speed (Rmax)            Location
2008   IBM Roadrunner       1.026 PFLOPS; 1.105 PFLOPS   DoE-Los Alamos National Laboratory, New Mexico, USA
2009   Cray Jaguar          1.759 PFLOPS                 DoE-Oak Ridge National Laboratory, Tennessee, USA
2010   Tianhe-1A            2.566 PFLOPS                 National Supercomputing Center, Tianjin, China
2011   Fujitsu K computer   8.162 PFLOPS                 RIKEN, Kobe, Japan
2011   Fujitsu K computer   10.51 PFLOPS                 RIKEN, Kobe, Japan

Hardware and software design

Supercomputers using custom CPUs traditionally gained their speed over conventional computers through the use of innovative designs that allow them to perform many tasks in parallel, as well as complex detail engineering. They tend to be specialized for certain types of computation, usually numerical calculations, and perform poorly at more general computing tasks. Their memory hierarchy is very carefully designed to ensure the processor is kept fed with data and instructions at all times; in fact, much of the performance difference between slower computers and supercomputers is due to the memory hierarchy. Their I/O systems tend to be designed to support high bandwidth, with latency less of an issue, because supercomputers are not used for transaction processing.

As with all highly parallel systems, Amdahl's law applies, and supercomputer designs devote great effort to eliminating software serialization and using hardware to address the remaining bottlenecks.
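
For reference, Amdahl's law bounds the speedup available from parallel hardware; in a standard formulation (not specific to any machine discussed here), with P the parallelizable fraction of a program and N the number of processors:

    S(N) = \frac{1}{(1 - P) + \frac{P}{N}}

Even with P = 0.95, the speedup on arbitrarily many processors is capped at 1/(1 - 0.95) = 20, which is why eliminating serialization receives so much design attention.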

Energy consumption and heat management

A typical supercomputer consumes large amounts of electrical power, almost all of which is converted into heat, requiring cooling. For example, Tianhe-1A consumes 4.04 megawatts of electricity. The cost to power and cool the system can be significant; e.g., 4 MW at $0.10/kWh is $400 an hour, or about $3.5 million per year.
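
The annual figure follows directly from the quoted tariff; as a quick check of the arithmetic:

    4\,\mathrm{MW} \times \$0.10/\mathrm{kWh} = 4000\,\mathrm{kW} \times \$0.10/\mathrm{kWh} = \$400/\mathrm{h}, \qquad \$400/\mathrm{h} \times 8760\,\mathrm{h/yr} \approx \$3.5\ \mathrm{million/yr}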

Heat management is a major issue in complex electronic devices, and affects powerful computer systems in various ways. The thermal design power and CPU power dissipation issues in supercomputing surpass those of traditional computer cooling technologies. The supercomputing awards for green computing reflect this issue.
The packing of thousands of processors together inevitably generates significant heat density that needs to be dealt with. The Cray-2 was liquid cooled, and used a Fluorinert "cooling waterfall" which was forced through the modules under pressure. However, the submerged liquid cooling approach was not practical for multi-cabinet systems based on off-the-shelf processors, and in System X a special cooling system that combined air conditioning with liquid cooling was developed in conjunction with the Liebert company.

In the Blue Gene system, IBM deliberately used low-power processors to deal with heat density. The IBM Power 775, released in 2011, has closely packed elements that require water cooling. The IBM Aquasar system, by contrast, uses hot-water cooling to achieve energy efficiency, with the water also being used to heat buildings.

The energy efficiency of computer systems is generally measured in terms of "FLOPS per watt". In 2008, IBM's Roadrunner operated at 376 MFLOPS/watt. In November 2010, the Blue Gene/Q reached 1684 MFLOPS/watt. In June 2011, the top two spots on the Green 500 list were occupied by Blue Gene machines in New York (one achieving 2097 MFLOPS/watt), with the DEGIMA cluster in Nagasaki placing third with 1375 MFLOPS/watt.
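
The metric is simply sustained performance divided by power draw; for a hypothetical machine (illustrative numbers, not any system in the list above) sustaining 1 PFLOPS while drawing 2 MW:

    \frac{10^{15}\ \mathrm{FLOPS}}{2 \times 10^{6}\ \mathrm{W}} = 5 \times 10^{8}\ \mathrm{FLOPS/W} = 500\ \mathrm{MFLOPS/W}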

Supercomputer challenges, technologies

Information cannot move faster than the speed of light between two parts of a supercomputer. For this reason, a supercomputer that is many meters across must have latencies between its components measured at least in the tens of nanoseconds. Seymour Cray's supercomputer designs attempted to keep cable runs as short as possible for this reason, hence the cylindrical shape of his Cray range of computers. In modern supercomputers built of many conventional CPUs running in parallel, latencies of 1–5 microseconds to send a message between CPUs are typical.
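
To make the latency floor concrete, the one-way light-travel time across an illustrative 10 m cable run (a distance chosen for the example, not taken from the text) is:

    t = \frac{d}{c} = \frac{10\ \mathrm{m}}{3 \times 10^{8}\ \mathrm{m/s}} \approx 33\ \mathrm{ns}

Signals in copper or optical fiber propagate at roughly two-thirds of c, so real round trips are slower still.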

Supercomputers consume and produce massive amounts of data in a very short period of time. According to Ken Batcher, "A supercomputer is a device for turning compute-bound problems into I/O-bound problems." Much work on external storage bandwidth is needed to ensure that this information can be transferred quickly and stored/retrieved correctly.

Technologies developed for supercomputers include:
  • Vector processing
  • Liquid cooling
  • Non-Uniform Memory Access (NUMA)
  • Striped disks (the first instance of what was later called RAID)
  • Parallel filesystems

Processing techniques

Vector processing techniques were first developed for supercomputers and continue to be used in specialist high-performance applications. They have also trickled down to the mass market in DSP architectures and SIMD (Single Instruction, Multiple Data) processing instructions for general-purpose computers.
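
As a minimal sketch of what SIMD means on a general-purpose CPU, the following C fragment uses x86 SSE intrinsics so that a single instruction operates on four floats at once; the function and array names are illustrative, not taken from the article.

```c
/* A minimal sketch of SIMD on a general-purpose CPU using x86 SSE intrinsics:
 * each intrinsic below operates on four packed floats with one instruction.
 * Function and array names are illustrative only. */
#include <stdio.h>
#include <xmmintrin.h>                      /* SSE intrinsics */

/* y[i] = a * x[i] + y[i], four elements per step (n must be divisible by 4). */
static void saxpy_simd(float a, const float *x, float *y, int n)
{
    __m128 va = _mm_set1_ps(a);             /* broadcast a into all four lanes */
    for (int i = 0; i < n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);    /* load four floats from x */
        __m128 vy = _mm_loadu_ps(y + i);    /* load four floats from y */
        vy = _mm_add_ps(_mm_mul_ps(va, vx), vy);   /* four multiply-adds at once */
        _mm_storeu_ps(y + i, vy);           /* store the four results */
    }
}

int main(void)
{
    float x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float y[8] = {0};
    saxpy_simd(2.0f, x, y, 8);
    printf("%g %g %g %g\n", y[4], y[5], y[6], y[7]);   /* prints 10 12 14 16 */
    return 0;
}
```

Modern compilers will often generate equivalent SIMD code automatically from the plain scalar loop when optimization is enabled.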

Modern video game consoles in particular use SIMD extensively, and this is the basis for some manufacturers' claim that their game machines are themselves supercomputers. Indeed, some graphics cards have the computing power of several teraFLOPS. The applications to which this power could be applied were limited by the special-purpose nature of early video processing. As video processing has become more sophisticated, graphics processing units (GPUs) have evolved to become more useful as general-purpose vector processors, and an entire computer science sub-discipline has arisen to exploit this capability: general-purpose computing on graphics processing units (GPGPU).

The Top500 list from May 2010 included three supercomputers based on GPGPUs; in particular, the number-four system, Nebulae, built by Dawning in China, is based on GPGPUs.

Operating systems

Supercomputers today most often use variants of the Linux operating system.

Until the early-to-mid-1980s, supercomputers usually sacrificed instruction set compatibility and code portability for performance (processing and memory access speed). For the most part, supercomputers up to this time (unlike high-end mainframes) had vastly different operating systems. The Cray-1 alone had at least six different proprietary OSs largely unknown to the general computing community. In a similar manner, different and incompatible vectorizing and parallelizing compilers for Fortran existed. This trend would have continued with the ETA-10 were it not for the initial instruction set compatibility between the Cray-1 and the Cray X-MP, and the adoption of operating systems such as Cray's Unicos, or Linux.

Programming

The parallel architectures of supercomputers often dictate the use of special programming techniques to exploit their speed. The base language of supercomputer code is, in general, Fortran or C, using special libraries to share data between nodes. In the most common scenario, environments such as PVM and MPI for loosely connected clusters and OpenMP for tightly coordinated shared-memory machines are used. Significant effort is required to optimize an algorithm for the interconnect characteristics of the machine it will be run on; the aim is to prevent any of the CPUs from wasting time waiting on data from other nodes. GPGPUs have hundreds of processor cores and are programmed using programming models such as CUDA and OpenCL.
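
A minimal sketch of the message-passing style described above, using standard MPI calls (the program itself is illustrative and not tied to any particular machine):

```c
/* Illustrative use of MPI to share data between nodes: rank 0 fills a buffer
 * and sends it to rank 1. Standard MPI calls; the program is a sketch, not
 * code from any system described in the article. Run with at least two
 * processes, e.g. "mpirun -np 2 ./a.out". */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */

    double buf[4] = {0.0, 0.0, 0.0, 0.0};

    if (rank == 0) {
        for (int i = 0; i < 4; i++)
            buf[i] = i + 1.0;                                  /* data lives on node 0 */
        MPI_Send(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);    /* ship it to node 1 */
    } else if (rank == 1) {
        MPI_Recv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %g %g %g %g\n", buf[0], buf[1], buf[2], buf[3]);
    }

    MPI_Finalize();
    return 0;
}
```

OpenMP covers the shared-memory side of such programs through compiler pragmas, while CUDA or OpenCL kernels offload inner loops to GPGPU co-processors.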

Software tools

Software tools for distributed processing include standard APIs such as MPI and PVM, VTL, and open source software solutions such as Beowulf, WareWulf, and openMosix, which facilitate the creation of a supercomputer from a collection of ordinary workstations or servers. Technology like ZeroConf (Rendezvous/Bonjour) can be used to create ad hoc computer clusters for specialized software such as Apple's Shake compositing application. An easy programming language for supercomputers remains an open research topic in computer science. Several utilities that would once have cost several thousands of dollars are now completely free thanks to the open source community, which often creates disruptive technology.

Modern supercomputer architecture

Supercomputers today often have a similar top-level architecture consisting of a cluster of MIMD multiprocessors, each processor of which is SIMD, and with each multiprocessor controlling multiple co-processors. The supercomputers vary radically with respect to the number of multiprocessors per cluster, the number of processors per multiprocessor, the number of simultaneous instructions per SIMD processor, and the type and number of co-processors. Within this hierarchy we have (a short code sketch of how the levels combine follows the list):
  • A computer cluster is a collection of computers that are highly interconnected via a high-speed network or switching fabric. Each computer runs under a separate instance of an operating system (OS).
  • A multiprocessing computer is a computer, operating under a single instance of an OS and using more than one CPU core, wherein the application-level software is indifferent to the number of CPU cores. The cores share tasks using symmetric multiprocessing (SMP) and Non-Uniform Memory Access (NUMA). The cores may be spread across anywhere from one to thousands of multicore processor devices.
  • A SIMD core executes the same instruction on more than one set of data at the same time. The core may be a general-purpose commodity core or a special-purpose vector processor. It may be in a high-performance processor or a low-power processor. As of 2007, each core executes several SIMD instructions per nanosecond.
  • A co-processor is incapable of executing "standard" code, but with specialized programming can exceed the performance of the multiprocessor by several orders of magnitude for certain applications. Co-processors are often GPGPUs. The ratio of co-processors to general-purpose processors varies dramatically; the benchmark used for measuring TOP500 performance disregards the contribution of co-processors.
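
A compact sketch of how these levels combine in practice, assuming a conventional MPI-plus-OpenMP setup (all names and numbers below are illustrative):

```c
/* Sketch of the hierarchy above: MPI spans the cluster, OpenMP spans the
 * cores of one multiprocessing node, and the simple loop body is the kind of
 * work a compiler can map onto each core's SIMD units. Illustrative only. */
#include <stdio.h>
#include <mpi.h>

#define N 1000000

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local_sum = 0.0;

    /* Cluster level: each MPI rank takes a strided slice of the index range.
     * Node level: OpenMP threads share that slice across the node's cores. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = rank; i < N; i += size)
        local_sum += 0.5 * (double)i;

    /* Combine the per-node results back on rank 0. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %g\n", global_sum);

    MPI_Finalize();
    return 0;
}
```

A co-processor layer (for example, CUDA or OpenCL kernels on GPGPUs) would sit below this, offloading the innermost computation.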


As of June 2011, the fastest supercomputer in the world is the K computer, which has over 68,000 8-core processors, while the Tianhe-1A system at the National University of Defense Technology ranks second with more than 14,000 multi-core processors.

In February 2009, IBM also announced work on "Sequoia", which is planned to be a 20-petaflops supercomputer, equivalent to 2 million laptops (whereas Roadrunner is comparable to a mere 100,000 laptops). It is slated for deployment in late 2011. Sequoia will be powered by 1.6 million cores (specific 45-nanometer chips in development) and 1.6 petabytes of memory, and will be housed in 96 refrigerators spanning roughly 3,000 square feet.

Moore's law and economies of scale are the dominant factors in supercomputer design. The design concepts that allowed past supercomputers to outperform desktop machines of the time tended to be gradually incorporated into commodity PCs. Furthermore, the costs of chip development and production make it uneconomical to design custom chips for a small run, and favor mass-produced chips that have enough demand to recoup the cost of production. A current-model quad-core Xeon workstation running at 2.66 GHz will outperform a multimillion-dollar Cray C90 supercomputer used in the early 1990s; most workloads requiring such a supercomputer in the 1990s can now be done on workstations costing less than 4,000 US dollars as of 2010. Supercomputing is also increasing in density, making desktop supercomputers available and delivering, in less than a desktop footprint, the computing power that in 1998 required a large room.

In addition, many problems carried out by supercomputers are particularly suitable for parallelization (in essence, splitting up into smaller parts to be worked on simultaneously) and, in particular, fairly coarse-grained parallelization that limits the amount of information that needs to be transferred between independent processing units. For this reason, traditional supercomputers can be replaced, for many applications, by "clusters" of computers of standard design, which can be programmed to act as one large computer.

Special-purpose supercomputers

A special-purpose supercomputer is a high-performance computing device with a hardware architecture dedicated to a single problem. This allows the use of specially programmed FPGA chips or even custom VLSI chips, allowing better price/performance ratios by sacrificing generality. They are used for applications such as astrophysics computations and brute-force codebreaking. Historically, a new special-purpose supercomputer has occasionally been faster than the world's fastest general-purpose supercomputer by some measure. For example, GRAPE-6 was faster than the Earth Simulator in 2002 for a particular set of problems.

Examples of special-purpose supercomputers:
  • Belle, Deep Blue, and Hydra, for playing chess
  • Reconfigurable computing machines or parts of machines
  • GRAPE, for astrophysics and molecular dynamics
  • Deep Crack, for breaking the DES cipher
  • MDGRAPE-3, for protein structure computation
  • D. E. Shaw Research Anton, for simulating molecular dynamics

Measuring supercomputer speed

In general, the speed of a supercomputer is measured in "FLOPS" (FLoating Point Operations Per Second), commonly used with an SI prefix such as tera-, combined into the shorthand "TFLOPS" (10^12 FLOPS, pronounced teraflops), or peta-, combined into the shorthand "PFLOPS" (10^15 FLOPS, pronounced petaflops). This measurement is quoted either as the theoretical floating-point performance of a processor (derived from the manufacturer's processor specifications and shown as Rpeak in the TOP500 lists), which is generally unachievable when running real workloads, or as the achievable throughput (derived from runs of the Linpack benchmark and shown as Rmax in the TOP500 lists). The Linpack benchmark performs the LU decomposition of a large matrix. Linpack performance gives some indication of performance for some real-world problems, but does not necessarily match the processing requirements of many other supercomputer workloads, which may, for example, require more memory bandwidth than Linpack, better integer computing performance, or a high-performance I/O system.
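To make the Rpeak/Rmax distinction concrete, the sketch below (with made-up hardware figures, and assuming NumPy is available) computes a theoretical peak from processor specifications and then estimates achieved throughput by timing an LU-based dense solve and dividing by the roughly 2n^3/3 floating-point operations such a factorization requires.

    # Rpeak-style estimate from (assumed) specifications versus an
    # Rmax-style measurement obtained by timing a Linpack-like solve.
    import time
    import numpy as np

    nodes, cores_per_node, ghz, flops_per_cycle = 100, 16, 2.0, 8   # illustrative specs
    rpeak = nodes * cores_per_node * ghz * 1e9 * flops_per_cycle
    print(f"Theoretical peak: {rpeak / 1e12:.1f} TFLOPS")

    n = 2000
    a = np.random.rand(n, n)
    b = np.random.rand(n)
    start = time.perf_counter()
    np.linalg.solve(a, b)            # LAPACK dense solve, LU-factorization based
    elapsed = time.perf_counter() - start
    achieved = (2.0 / 3.0) * n**3 / elapsed
    print(f"Measured on this machine: {achieved / 1e9:.1f} GFLOPS")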

"Petascale
Petascale
In computing, petascale refers to a computer system capable of reaching performance in excess of one petaflop, i.e. one quadrillion floating point operations per second. The standard benchmark tool is LINPACK and Top500.org is the organisation which tracks the fastest supercomputers...

" supercomputers can process one quadrillion (1015) (1000 trillion) FLOPS. Exascale
Exascale computing
Exascale computing is a 21st-century attempt to move computing capabilities beyond the existing petascale. If achieved, it would represent a thousandfold increase over that scale...

 is computing performance in the exaflops range. An exaflop is one quintillion (1018) FLOPS (one million teraflops).

The TOP500 list

Since 1993, the fastest supercomputers have been ranked on the TOP500 list according to their LINPACK benchmark results. The list does not claim to be unbiased or definitive, but it is a widely cited current definition of the "fastest" supercomputer available at any given time.

Current fastest supercomputer system

The K computer is the world's fastest supercomputer, at 10.51 petaFLOPS. It consists of 88,000 SPARC64 VIIIfx CPUs and spans 864 server racks. Fujitsu has not given an official power consumption figure for the completed K cluster, but in June, when it reached a one-petaflop peak, it consumed 9.89 megawatts, costing about $9.89 million a year.

Opportunistic supercomputing

Opportunistic supercomputing is a form of networked grid computing whereby a “super virtual computer” of many loosely coupled volunteer computing machines performs very large computing tasks. Grid computing has been applied to a number of large-scale embarrassingly parallel problems that require supercomputing-scale performance. However, basic grid and cloud computing approaches that rely on volunteer computing cannot handle traditional supercomputing tasks such as fluid dynamics simulations.
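The contrast can be illustrated with a toy workload (hypothetical, not taken from any real volunteer-computing project): a Monte Carlo estimate of pi splits into work units that never need to communicate with one another, whereas each step of a fluid dynamics simulation needs its neighbours' latest results, which a high-latency volunteer network cannot deliver efficiently.

    # Embarrassingly parallel work units: each one is independent, can run
    # anywhere, and the results can be combined in any order.
    import random
    from concurrent.futures import ProcessPoolExecutor

    def work_unit(samples):
        """Count random points landing inside the unit quarter circle."""
        hits = 0
        for _ in range(samples):
            x, y = random.random(), random.random()
            if x * x + y * y <= 1.0:
                hits += 1
        return hits

    if __name__ == "__main__":
        units, samples = 8, 200_000
        with ProcessPoolExecutor() as pool:
            hits = sum(pool.map(work_unit, [samples] * units))
        print("pi is approximately", 4 * hits / (units * samples))
        # A fluid simulation, by contrast, must exchange boundary data
        # between neighbouring subdomains at every time step.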

Examples of opportunistic supercomputing systems

The fastest grid computing system, Folding@home, has reported 8.8 petaflops of processing power. Of this, 7.1 petaflops are contributed by clients running on various GPUs, 1.8 petaflops come from PlayStation 3 systems, and the rest from various computer systems.

The BOINC platform hosts a number of distributed computing projects. BOINC has recorded a processing power of over 5.5 petaflops through over 480,000 active computers on the network. The most active project (measured by computational power), MilkyWay@home, reports processing power of over 700 teraflops through over 33,000 active computers.

GIMPS's distributed Mersenne prime search currently achieves about 60 teraflops through over 25,000 registered computers. The Internet PrimeNet Server has supported GIMPS's grid computing approach, one of the earliest and most successful grid computing projects, since 1997.
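Each GIMPS work unit is essentially one primality test of a candidate Mersenne number 2^p − 1. The sketch below is a plain-Python rendering of the standard Lucas–Lehmer test (vastly slower than the optimized Prime95 client, and shown only to illustrate how self-contained such a work unit is).

    # Lucas-Lehmer test: for an odd prime p, M_p = 2**p - 1 is prime exactly
    # when s_(p-2) == 0 (mod M_p), where s_0 = 4 and s_k = s_(k-1)**2 - 2.
    def lucas_lehmer(p):
        if p == 2:
            return True                  # M_2 = 3 is prime
        m = (1 << p) - 1                 # the Mersenne number 2**p - 1
        s = 4
        for _ in range(p - 2):
            s = (s * s - 2) % m
        return s == 0

    print([p for p in (2, 3, 5, 7, 11, 13, 17, 19) if lucas_lehmer(p)])
    # -> [2, 3, 5, 7, 13, 17, 19]; 11 fails because 2**11 - 1 = 2047 = 23 * 89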

Quasi-opportunistic supercomputing

Quasi-opportunistic supercomputing is a form of distributed computing whereby the “super virtual computer” of a large number of networked, geographically dispersed computers performs computing tasks that demand huge processing power. Quasi-opportunistic supercomputing aims to provide a higher quality of service than opportunistic grid computing by achieving more control over the assignment of tasks to distributed resources and by using intelligence about the availability and reliability of individual systems within the supercomputing network. However, quasi-opportunistic distributed execution of demanding parallel computing software in grids requires the implementation of grid-wise allocation agreements, co-allocation subsystems, communication topology-aware allocation mechanisms, fault-tolerant message passing libraries, and data pre-conditioning.
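A highly simplified sketch of the reliability-aware scheduling idea (an illustrative toy, not the interface of any actual quasi-opportunistic middleware) is to hand tasks to the nodes with the best observed track record first and to replicate tasks placed on less dependable nodes.

    # Toy reliability-aware scheduler: prefer dependable nodes and
    # replicate tasks assigned to nodes with a poor completion history.
    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        reliability: float               # fraction of past tasks completed

    def assign(tasks, nodes, replicate_below=0.9):
        ranked = sorted(nodes, key=lambda n: n.reliability, reverse=True)
        plan = []
        for i, task in enumerate(tasks):
            primary = ranked[i % len(ranked)]
            placement = [primary.name]
            if primary.reliability < replicate_below and ranked[0] is not primary:
                placement.append(ranked[0].name)   # replicate on the best node
            plan.append((task, placement))
        return plan

    nodes = [Node("a", 0.99), Node("b", 0.95), Node("c", 0.70)]
    for task, where in assign(["t1", "t2", "t3", "t4"], nodes):
        print(task, "->", where)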

Examples of quasi-opportunistic supercomputing systems

The PlayStation 3 Gravity Grid uses a network of 16 machines and exploits the Cell processor for the intended application, which is performing astrophysical simulations of large supermassive black holes capturing smaller compact objects. The Cell processor has a main CPU and six floating-point vector processors, giving the cluster a total of 16 general-purpose processors and 96 vector processors. The cluster was built in 2007 by Dr. Gaurav Khanna, a professor in the Physics Department of the University of Massachusetts Dartmouth, with support from Sony Computer Entertainment, and it is the first PS3 cluster to have generated numerical results published in the scientific research literature.

Also a "quasi-supercomputer" is Google
Google
Google Inc. is an American multinational public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program...

's search engine system
Google platform
Google requires large computational resources in order to provide their services. This article describes the technological infrastructure behind Google's websites, as presented in the company's public announcements.-Original hardware:...

 with estimated total processing power of between 126 and 316 teraflops, as of April 2004. In June 2006 the New York Times estimated that the Googleplex
Googleplex
The Googleplex is the corporate headquarters complex of Google, Inc., located at 1600 Amphitheatre Parkway in Mountain View, Santa Clara County, California, United States, near San Jose. "Googleplex" is a portmanteau of Google and complex, and a reference to googolplex, the name given to the large...

 and its server farm
Server farm
A server farm or server cluster is a collection of computer servers usually maintained by an enterprise to accomplish server needs far beyond the capability of one machine. Server farms often have backup servers, which can take over the function of primary servers in the event of a primary server...

s contain 450,000 servers. According to 2008 estimates, the processing power of Google's cluster might reach from 20 to 100 petaflops.

Other notable computer clusters are the flash mob cluster, the Qoscos Grid and the Beowulf cluster. The flash mob cluster allows the use of any computer in the network, while the Beowulf cluster still requires uniform architecture.

Research and development

IBM is developing the Cyclops64 architecture, intended to create a "supercomputer on a chip".

Other PFLOPS projects include one by Narendra Karmarkar in India, a C-DAC effort targeted for 2010, and the Blue Waters Petascale Computing System funded by the NSF ($200 million) that is being built by the NCSA at the University of Illinois at Urbana-Champaign (slated to be completed by 2011).

In May 2008 a collaboration was announced between NASA, SGI and Intel to build a 1 petaflops computer, Pleiades, in 2009, scaling up to 10 PFLOPS by 2012. Meanwhile, IBM is constructing a 20 PFLOPS supercomputer, named Sequoia, at Lawrence Livermore National Laboratory, based on the Blue Gene architecture, which is scheduled to go online in 2011.

Given the current speed of progress, supercomputers are projected to reach 1 exaflops (10^18, one quintillion FLOPS) in 2019. Using the Intel MIC multi-core processor architecture, which is Intel's response to GPU systems, SGI plans to achieve a 500-fold increase in performance by 2018 in order to reach one exaflops. Samples of MIC chips with 32 cores, which combine vector processing units with standard CPU cores, have become available.

On October 11, 2011, the Oak Ridge National Laboratory announced it was building a 20 petaflops supercomputer, named Titan, to become operational in 2012. The hybrid Titan system will combine AMD Opteron processors with "Kepler" NVIDIA Tesla graphics processing unit (GPU) technology.

Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaflops (10^21, one sextillion FLOPS) computer is required to accomplish full weather modeling that could cover a two-week time span accurately. Such systems might be built around 2030.

Applications of supercomputers

Decade   Uses and computer involved
1970s    Weather forecasting, aerodynamic research (Cray-1).
1980s    Probabilistic analysis, radiation shielding modeling (CDC Cyber).
1990s    Brute-force code breaking (EFF DES cracker).
2000s    3D nuclear test simulations as a substitute for testing restricted by the Nuclear Non-Proliferation Treaty (ASCI Q).
2010s    Molecular dynamics simulation (Tianhe-1A).

See also

  • SC (conference)
  • Supercomputing in China
  • Supercomputing in India
  • Supercomputing in Europe
  • Supercomputing in Japan
  • TOP500
  • The Journal of Supercomputing
