Particle swarm optimization
In computer science, particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. PSO optimizes a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formulae over the particle's position and velocity. Each particle's movement is influenced by its local best known position and is also guided toward the best known positions in the search-space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions.

PSO is originally attributed to Kennedy, Eberhart and Shi and was first intended for simulating social behaviour, as a stylized representation of the movement of organisms in a bird flock or fish school. The algorithm was later simplified, and it was observed to be performing optimization. The book by Kennedy and Eberhart describes many philosophical aspects of PSO and swarm intelligence. An extensive survey of PSO applications has been made by Poli.

PSO is a metaheuristic, as it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics such as PSO do not guarantee that an optimal solution is ever found. More specifically, PSO does not use the gradient of the problem being optimized, which means PSO does not require that the optimization problem be differentiable, as is required by classic optimization methods such as gradient descent and quasi-Newton methods. PSO can therefore also be used on optimization problems that are partially irregular, noisy, change over time, etc.

Algorithm

A basic variant of the PSO algorithm works by having a population (called a swarm) of candidate solutions (called particles). These particles are moved around in the search-space according to a few simple formulae. The movements of the particles are guided by their own best known position in the search-space as well as the entire swarm's best known position. When improved positions are discovered, these then come to guide the movements of the swarm. The process is repeated and, by doing so, it is hoped, but not guaranteed, that a satisfactory solution will eventually be discovered.

Formally, let f : ℝⁿ → ℝ be the fitness or cost function which must be minimized. The function takes a candidate solution as argument in the form of a vector of real numbers and produces a real number as output which indicates the fitness of the given candidate solution. The gradient of f is not known. The goal is to find a solution a for which f(a) ≤ f(b) for all b in the search-space, which would mean a is the global minimum. Maximization can be performed by considering the function h = -f instead.

Let S be the number of particles in the swarm, each having a position xi ∈ ℝⁿ in the search-space and a velocity vi ∈ ℝⁿ. Let pi be the best known position of particle i and let g be the best known position of the entire swarm. A basic PSO algorithm is then:

  • For each particle i = 1, ..., S do:
    • Initialize the particle's position with a uniformly distributed random vector: xi ~ U(blo, bup), where blo and bup are the lower and upper boundaries of the search-space.
    • Initialize the particle's best known position to its initial position: pi ← xi
    • If f(pi) < f(g), update the swarm's best known position: g ← pi
    • Initialize the particle's velocity: vi ~ U(-|bup-blo|, |bup-blo|)
  • Until a termination criterion is met (e.g. number of iterations performed, or adequate fitness reached), repeat:
    • For each particle i = 1, ..., S do:
      • Pick random numbers: rp, rg ~ U(0, 1)
      • Update the particle's velocity: vi ← ω vi + φp rp (pi - xi) + φg rg (g - xi)
      • Update the particle's position: xi ← xi + vi
      • If f(xi) < f(pi), do:
        • Update the particle's best known position: pi ← xi
        • If f(pi) < f(g), update the swarm's best known position: g ← pi
  • Now g holds the best found solution.


The parameters ω, φp, and φg are selected by the practitioner and control the behaviour and efficacy of the PSO method (see below).
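
As an illustration, here is a minimal sketch of the basic algorithm above in Python using NumPy. The function name pso, the parameter defaults, and the use of per-dimension random coefficients (a common variant, see below) are illustrative choices rather than part of the original formulation:

  import numpy as np

  def pso(f, b_lo, b_up, n_particles=30, n_dims=2,
          omega=0.7, phi_p=1.5, phi_g=1.5, n_iters=200):
      # Basic PSO following the pseudocode above; a sketch, not a tuned
      # implementation. b_lo and b_up bound every dimension of the search-space.
      rng = np.random.default_rng()
      span = abs(b_up - b_lo)
      x = rng.uniform(b_lo, b_up, size=(n_particles, n_dims))   # positions
      v = rng.uniform(-span, span, size=(n_particles, n_dims))  # velocities
      p = x.copy()                              # best known position per particle
      p_val = np.apply_along_axis(f, 1, x)      # fitness at those positions
      g = p[np.argmin(p_val)].copy()            # swarm's best known position
      g_val = p_val.min()
      for _ in range(n_iters):
          # Per-dimension random coefficients r_p, r_g ~ U(0, 1).
          r_p = rng.uniform(size=(n_particles, n_dims))
          r_g = rng.uniform(size=(n_particles, n_dims))
          v = omega * v + phi_p * r_p * (p - x) + phi_g * r_g * (g - x)
          x = x + v
          vals = np.apply_along_axis(f, 1, x)
          improved = vals < p_val               # particles that improved their best
          p[improved] = x[improved]
          p_val[improved] = vals[improved]
          if p_val.min() < g_val:               # update the global best if needed
              g_val = p_val.min()
              g = p[np.argmin(p_val)].copy()
      return g, g_val

  # Example: minimize the sphere function, whose global minimum is 0 at the origin.
  best_x, best_f = pso(lambda z: np.sum(z**2), b_lo=-5.0, b_up=5.0)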

Parameter selection

The choice of PSO parameters can have a large impact on optimization performance. Selecting PSO parameters that yield good performance has therefore been the subject of much research.

The PSO parameters can also be tuned by using another overlaying optimizer, a concept known as meta-optimization. Parameters have also been tuned for various optimization scenarios.
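
To make the concept concrete, the following is a minimal sketch of meta-optimization as a coarse random search over the behavioural parameters, reusing the pso function sketched above; the parameter ranges and the sphere benchmark are illustrative assumptions, not recommended settings:

  import numpy as np

  # Meta-optimization sketch: score each candidate (ω, φp, φg) setting by the
  # fitness the inner PSO run achieves on a benchmark problem, and keep the best.
  rng = np.random.default_rng(0)
  benchmark = lambda z: np.sum(z**2)        # sphere function as the inner problem

  best_params, best_score = None, np.inf
  for _ in range(20):                       # try 20 random parameter settings
      omega = rng.uniform(0.1, 1.0)
      phi_p = rng.uniform(0.5, 2.5)
      phi_g = rng.uniform(0.5, 2.5)
      _, score = pso(benchmark, b_lo=-5.0, b_up=5.0,
                     omega=omega, phi_p=phi_p, phi_g=phi_g, n_iters=100)
      if score < best_score:
          best_params, best_score = (omega, phi_p, phi_g), score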

Inner workings

There are several schools of thought as to why and how the PSO algorithm can perform optimization.

A common belief amongst researchers is that the swarm behaviour varies between exploratory behaviour, that is, searching a broader region of the search-space, and exploitative behaviour, that is, a locally oriented search so as to get closer to a (possibly local) optimum. This school of thought has been prevalent since the inception of PSO. It contends that the PSO algorithm and its parameters must be chosen so as to properly balance exploration and exploitation, avoiding premature convergence to a local optimum while still ensuring a good rate of convergence to the optimum. This belief is the precursor of many PSO variants; see below.

Another school of thought is that the behaviour of a PSO swarm is not well understood in terms of how it affects actual optimization performance, especially for higher dimensional search-spaces and optimization problems that may be discontinuous, noisy, and time-varying. This school of thought merely tries to find PSO algorithms and parameters that cause good performance regardless of how the swarm behaviour can be interpreted in relation to e.g. exploration and exploitation. Such studies have led to the simplification of the PSO algorithm, see below.

Convergence

In relation to PSO the word convergence typically means one of two things, although it is often not clarified which definition is meant and sometimes they are mistakenly thought to be identical.
  • Convergence may refer to the swarm's best known position g approaching (converging to) the optimum of the problem, regardless of how the swarm behaves.
  • Convergence may refer to a swarm collapse in which all particles have converged to a point in the search-space, which may or may not be the optimum.
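
The second kind of convergence can be monitored directly from the particle positions; a small sketch, assuming the NumPy position array x from the earlier example and an illustrative helper name:

  import numpy as np

  def swarm_collapsed(x, tol=1e-8):
      # The swarm has collapsed to (numerically) a single point when the largest
      # per-dimension standard deviation of the positions falls below a tolerance.
      return np.max(np.std(x, axis=0)) < tol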


Several attempts at mathematically analyzing PSO convergence exist in the literature. These analyses have resulted in guidelines for selecting PSO parameters that are believed to cause convergence, divergence or oscillation of the swarm's particles, and the analyses have also given rise to several PSO variants. However, the analyses were criticized by Pedersen for being oversimplified, as they assume the swarm has only one particle, that it does not use stochastic variables, and that the points of attraction, that is, the particle's best known position p and the swarm's best known position g, remain constant throughout the optimization process. Furthermore, some analyses allow for an infinite number of optimization iterations, which is not possible in reality. Determining the convergence capabilities of different PSO algorithms and parameters therefore still depends on empirical results.

Variants

Numerous variants of even a basic PSO algorithm are possible. For example, there are different ways to initialize the particles and velocities (e.g. start with zero velocities instead), to dampen the velocity, to pick random numbers rp and rg for each dimension when updating the velocity-vector rather than using the same numbers for the entire vector, to only update pi and g after the entire swarm has been updated, or to use a subset of the swarm (known as a neighbourhood) instead of g when updating a particle's velocity (see the sketch below). Some of these choices and their possible performance impact have been discussed in the literature.
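
As an example of the last of these choices, the following is a minimal sketch of a ring-topology neighbourhood (sometimes called the "local best" variant); the helper name ring_local_best and the choice of two neighbours per particle are illustrative assumptions:

  import numpy as np

  def ring_local_best(p, p_val):
      # For each particle i, return the best known position among particles
      # i-1, i and i+1 on a ring (indices wrap around at the ends).
      n = len(p_val)
      local = np.empty_like(p)
      for i in range(n):
          neighbours = [(i - 1) % n, i, (i + 1) % n]
          best = min(neighbours, key=lambda j: p_val[j])
          local[i] = p[best]
      return local

  # In the main loop, the velocity update then uses the i-th row of this array
  # (call it l_i) in place of the single global best g:
  #   v_i ← ω v_i + φp r_p (p_i - x_i) + φg r_g (l_i - x_i)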

New and more sophisticated PSO variants are also continually being introduced in an attempt to improve optimization performance. There are certain trends in that research: one is to make a hybrid optimization method using PSO combined with other optimizers; another is to try to alleviate premature convergence (that is, optimization stagnation), e.g. by reversing or perturbing the movement of the PSO particles; and there are also attempts at adapting the behavioural parameters of PSO during optimization.

Simplifications

Another school of thought is that PSO should be simplified as much as possible without impairing its performance, a general concept often referred to as Occam's razor. Simplifying PSO was originally suggested by Kennedy and has been studied more extensively, where it appeared that optimization performance was improved, that the parameters were easier to tune, and that they performed more consistently across different optimization problems.

Another argument in favour of simplifying PSO is that metaheuristics can only have their efficacy demonstrated empirically, by doing computational experiments on a finite number of optimization problems. This means a metaheuristic such as PSO cannot be proven correct, which increases the risk of making errors in its description and implementation. A good example of this presented a promising variant of a genetic algorithm (another popular metaheuristic), but it was later found to be defective: it was strongly biased in its optimization search towards similar values for different dimensions in the search space, which happened to be the optimum of the benchmark problems considered. This bias was due to a programming error and has since been fixed.

Multi-objective optimization

PSO has also been applied to multi-objective problems, in which the fitness comparison takes Pareto dominance into account when moving the PSO particles, and non-dominated solutions are stored so as to approximate the Pareto front.
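
The dominance test itself is straightforward; a minimal sketch, assuming all objectives are to be minimized and using an illustrative helper name:

  import numpy as np

  def dominates(a, b):
      # Fitness vector a Pareto-dominates b when a is no worse in every
      # objective and strictly better in at least one (all objectives minimized).
      a, b = np.asarray(a), np.asarray(b)
      return bool(np.all(a <= b) and np.any(a < b))

  dominates([1.0, 2.0], [1.5, 2.0])  # True: better in one objective, equal in the other
  dominates([1.0, 3.0], [1.5, 2.0])  # False: neither solution dominates the other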

See also

  • Swarm intelligence
  • Bees algorithm / Artificial Bee Colony Algorithm
  • Particle filter
