Lloyd's algorithm
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

 and electrical engineering
Electrical engineering
Electrical engineering is a field of engineering that generally deals with the study and application of electricity, electronics and electromagnetism. The field first became an identifiable occupation in the late nineteenth century after commercialization of the electric telegraph and electrical...

, Lloyd's algorithm, also known as Voronoi iteration or relaxation, is an algorithm for grouping data points into a given number of categories, used for k-means clustering.

Lloyd's algorithm is usually used in a Euclidean space
Euclidean space
In mathematics, Euclidean space is the Euclidean plane and three-dimensional space of Euclidean geometry, as well as the generalizations of these notions to higher dimensions...

, so the distance function serves as a measure of similarity between points, and averaging of each dimension for the averaging, but this need not be the case.

Lloyd's algorithm starts by partitioning the input points into k initial sets, either at random or using some heuristic
Heuristic
Heuristic refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a satisfactory solution, where an exhaustive search is impractical...

. It then calculates the average point, or centroid
Centroid
In geometry, the centroid, geometric center, or barycenter of a plane figure or two-dimensional shape X is the intersection of all straight lines that divide X into two parts of equal moment about the line. Informally, it is the "average" of all points of X...

, of each set via some metric (usually averaging dimensions in Euclidean space
Euclidean space
In mathematics, Euclidean space is the Euclidean plane and three-dimensional space of Euclidean geometry, as well as the generalizations of these notions to higher dimensions...

). It constructs a new partition by associating each point with the closest centroid, usually using the Euclidean distance
Euclidean distance
In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" distance between two points that one would measure with a ruler, and is given by the Pythagorean formula. By using this formula as distance, Euclidean space becomes a metric space...

 function. Then the centroids are recalculated for the new clusters, and algorithm repeated by alternate application of these two steps until convergence, which is obtained when the points no longer switch clusters (or alternatively centroids are no longer changed).

More formally:

Lloyd's algorithm starts with an initial distribution of samples or points and consists of repeatedly executing one relaxation step:
  • The Voronoi diagram
    Voronoi diagram
    In mathematics, a Voronoi diagram is a special kind of decomposition of a given space, e.g., a metric space, determined by distances to a specified family of objects in the space...

     of all the points is computed.
  • Each cell of the Voronoi diagram is integrated and the centroid is computed.
  • Each point is then moved to the centroid of its voronoi cell.


Each time a relaxation step is performed, the points are left in a slightly more even distribution: closely spaced points move further apart, and widely spaced points move closer together. In one dimension, this algorithm has been shown to converge to a centroidal Voronoi diagram, also named a centroidal Voronoi tessellation
Centroidal Voronoi tessellation
In geometry, a centroidal Voronoi tessellation is a special type of Voronoi tessellation or Voronoi diagrams. A Voronoi tessellation is called centroidal when the generating point of each Voronoi cell is also its mean . It can be viewed as an optimal partition corresponding to an optimal...

 . In higher dimensions, some slightly weaker convergence results are known , .

Since the algorithm converges slowly, and, due to limitations in numerical precision the algorithm will often not converge, real-world applications of Lloyd's algorithm stop once the distribution is "good enough." One common termination criterion is when the maximum distance a point moves in one iteration is below some set limit.

Lloyd's method is used in computer graphics because the resulting distribution has blue noise characteristics (see also Colors of noise
Colors of noise
While noise is by definition derived from a random signal, it can have different characteristic statistical properties corresponding to different mappings from a source of randomness to the concrete noise. Spectral density is such a property, which can be used to distinguish different types of noise...

), meaning there are few low-frequency components that could be interpreted as artifacts. It is particularly well-suited to picking sample positions for dithering.

Lloyd's algorithm is also used to generate dot drawings in the style of stippling
Stippling
Stippling is the creation of a pattern simulating varying degrees of solidity or shading by using small dots. Such a pattern may occur in nature and these effects are frequently emulated by artists.-Art:...

 . In this application, the centroids can be weighted based on a reference image to produce stipple illustrations matching an input image.

See also

  • The Linde–Buzo–Gray algorithm, which is a generalization of this algorithm.
  • k-means clustering
  • Mean-shift
    Mean-shift
    Mean shift is a non-parametric feature-space analysis technique, a so-called mode seeking algorithm. Application domains include clustering in computer vision and image processing.- Overview :...

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK