Embarrassingly parallel
Encyclopedia
In parallel computing
, an embarrassingly parallel workload (or embarrassingly parallel problem) is one for which little or no effort is required to separate the problem into a number of parallel tasks. This is often the case where there exists no dependency (or communication) between those parallel tasks.
Embarrassingly parallel problems tend to require little or no communication of results between tasks, and are thus different from distributed computing
problems that require communication between tasks, especially communication of intermediate results. They are easy to perform on server farm
s which do not have any of the special infrastructure used in a true supercomputer
cluster. They are thus well suited to large, internet based distributed platforms such as BOINC.
A common example of an embarrassingly parallel problem lies within graphics processing unit
s (GPUs) for tasks such as 3D projection
, where each pixel on the screen may be rendered independently.
Parallel computing
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...
, an embarrassingly parallel workload (or embarrassingly parallel problem) is one for which little or no effort is required to separate the problem into a number of parallel tasks. This is often the case where there exists no dependency (or communication) between those parallel tasks.
Embarrassingly parallel problems tend to require little or no communication of results between tasks, and are thus different from distributed computing
Distributed computing
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal...
problems that require communication between tasks, especially communication of intermediate results. They are easy to perform on server farm
Server farm
A server farm or server cluster is a collection of computer servers usually maintained by an enterprise to accomplish server needs far beyond the capability of one machine. Server farms often have backup servers, which can take over the function of primary servers in the event of a primary server...
s which do not have any of the special infrastructure used in a true supercomputer
Supercomputer
A supercomputer is a computer at the frontline of current processing capacity, particularly speed of calculation.Supercomputers are used for highly calculation-intensive tasks such as problems including quantum physics, weather forecasting, climate research, molecular modeling A supercomputer is a...
cluster. They are thus well suited to large, internet based distributed platforms such as BOINC.
A common example of an embarrassingly parallel problem lies within graphics processing unit
Graphics processing unit
A graphics processing unit or GPU is a specialized circuit designed to rapidly manipulate and alter memory in such a way so as to accelerate the building of images in a frame buffer intended for output to a display...
s (GPUs) for tasks such as 3D projection
3D projection
3D projection is any method of mapping three-dimensional points to a two-dimensional plane. As most current methods for displaying graphical data are based on planar two-dimensional media, the use of this type of projection is widespread, especially in computer graphics, engineering and drafting.-...
, where each pixel on the screen may be rendered independently.
Examples
Some examples of embarrassingly parallel problems include:- Distributed relational database queries using distributed set processing
- Serving static files on a webserver.
- The Mandelbrot setMandelbrot setThe Mandelbrot set is a particular mathematical set of points, whose boundary generates a distinctive and easily recognisable two-dimensional fractal shape...
and other fractal calculations, where each point can be calculated independently. - RenderingRendering (computer graphics)Rendering is the process of generating an image from a model , by means of computer programs. A scene file contains objects in a strictly defined language or data structure; it would contain geometry, viewpoint, texture, lighting, and shading information as a description of the virtual scene...
of computer graphicsComputer graphicsComputer graphics are graphics created using computers and, more generally, the representation and manipulation of image data by a computer with help from specialized software and hardware....
. In ray tracing, each pixelPixelIn digital imaging, a pixel, or pel, is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled....
may be renderedRendering (computer graphics)Rendering is the process of generating an image from a model , by means of computer programs. A scene file contains objects in a strictly defined language or data structure; it would contain geometry, viewpoint, texture, lighting, and shading information as a description of the virtual scene...
independently. In computer animationComputer animationComputer animation is the process used for generating animated images by using computer graphics. The more general term computer generated imagery encompasses both static scenes and dynamic images, while computer animation only refers to moving images....
, each frame may be rendered independently (see parallel renderingParallel renderingParallel rendering is the application of parallel programming to the computational domain of computer graphics. Rendering graphics can require massive computational resources for complex scenes that arise in scientific visualization, medical visualization, CAD applications, and virtual reality...
). - Brute-force searchBrute-force searchIn computer science, brute-force search or exhaustive search, also known as generate and test, is a trivial but very general problem-solving technique that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem's...
es in cryptographyCryptographyCryptography is the practice and study of techniques for secure communication in the presence of third parties...
. A notable real-world example is distributed.netDistributed.netdistributed.net is a worldwide distributed computing effort that is attempting to solve large scale problems using otherwise idle CPU or GPU time. It is officially recognized as a non-profit organization under U.S...
. - BLASTBLASTIn bioinformatics, Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences...
searches in bioinformaticsBioinformaticsBioinformatics is the application of computer science and information technology to the field of biology and medicine. Bioinformatics deals with algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, software...
. - Large scale face recognitionFacial recognition systemA facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source...
that involves comparing thousands of input faces with similarly large number of faces. - Computer simulations comparing many independent scenarios, such as climate models.
- Genetic algorithmGenetic algorithmA genetic algorithm is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems...
s and other evolutionary computationEvolutionary computationIn computer science, evolutionary computation is a subfield of artificial intelligence that involves combinatorial optimization problems....
metaheuristicMetaheuristicIn computer science, metaheuristic designates a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Metaheuristics make few or no assumptions about the problem being optimized and can search very large spaces...
s. - Ensemble calculationsStatistical ensemble (mathematical physics)In mathematical physics, especially as introduced into statistical mechanics and thermodynamics by J. Willard Gibbs in 1878, an ensemble is an idealization consisting of a large number of mental copies of a system, considered all at once, each of which represents a possible state that the real...
of numerical weather predictionNumerical weather predictionNumerical weather prediction uses mathematical models of the atmosphere and oceans to predict the weather based on current weather conditions. Though first attempted in the 1920s, it was not until the advent of computer simulation in the 1950s that numerical weather predictions produced realistic...
. - Event simulation and reconstruction in particle physicsParticle physicsParticle physics is a branch of physics that studies the existence and interactions of particles that are the constituents of what is usually referred to as matter or radiation. In current understanding, particles are excitations of quantum fields and interact following their dynamics...
. - Sieving step of the quadratic sieveQuadratic sieveThe quadratic sieve algorithm is a modern integer factorization algorithm and, in practice, the second fastest method known . It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve...
and the number field sieve.
Implementations
- In R (programming language)R (programming language)R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians for developing statistical software, and R is widely used for statistical software development and data analysis....
- The snow (Simple Network of Workstations) package implements a simple mechanism for using a collection of workstations or a Beowulf cluster for embarrassingly parallel computations.
See also
- Amdahl's lawAmdahl's lawAmdahl's law, also known as Amdahl's argument, is named after computer architect Gene Amdahl, and is used to find the maximum expected improvement to an overall system when only part of the system is improved...
- an embarrassingly parallel problem would have P almost exactly equal to 1. - MapReduceMapReduceMapReduce is a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers. Parts of the framework are patented in some countries....
External links
- Embarrassingly parallel, Parallel algorithms
- Embarrassingly Parallel Computations, Engineering a Beowulf-style Compute Cluster
- "Star-P: High Productivity Parallel Computing"