Johnson–Lindenstrauss lemma
Encyclopedia
In mathematics, the Johnson–Lindenstrauss lemma is a result concerning low-distortion embedding
s of points from high-dimensional into low-dimensional Euclidean space
. The lemma states that a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz
, and can even be taken to be an orthogonal projection.
The lemma has uses in compressed sensing
, manifold learning, dimensionality reduction
, and graph embedding
. Much of the data stored and manipulated on computers, including text and images, can be represented as points in a high-dimensional space. However, the essential algorithms for working with such data tend to become bogged down very quickly as dimension increases. It is therefore desirable to reduce the dimensionality of the data in a way that preserves its relevant structure. The Johnson–Lindenstrauss lemma is a classic result in this vein.
Also the lemma is tight up to a factor log(1/ε), i.e. there exists a set of points of size m that needs dimension
in order to preserve the distances between all pair of points. See 4.
for all u, v ∈ X.
One proof of the lemma takes ƒ to be a suitable multiple of the orthogonal projection onto a random subspace of dimension n in RN, and exploits the phenomenon of concentration of measure
.
Embedding
In mathematics, an embedding is one instance of some mathematical structure contained within another instance, such as a group that is a subgroup....
s of points from high-dimensional into low-dimensional Euclidean space
Euclidean space
In mathematics, Euclidean space is the Euclidean plane and three-dimensional space of Euclidean geometry, as well as the generalizations of these notions to higher dimensions...
. The lemma states that a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz
Lipschitz continuity
In mathematical analysis, Lipschitz continuity, named after Rudolf Lipschitz, is a strong form of uniform continuity for functions. Intuitively, a Lipschitz continuous function is limited in how fast it can change: for every pair of points on the graph of this function, the absolute value of the...
, and can even be taken to be an orthogonal projection.
The lemma has uses in compressed sensing
Compressed sensing
Compressed sensing, also known as compressive sensing, compressive sampling and sparse sampling, is a technique for finding sparse solutions to underdetermined linear systems...
, manifold learning, dimensionality reduction
Dimensionality reduction
In machine learning, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.-Feature selection:...
, and graph embedding
Graph embedding
In topological graph theory, an embedding of a graph G on a surface Σ is a representation of G on Σ in which points of Σ are associated to vertices and simple arcs are associated to edges in such a way that:...
. Much of the data stored and manipulated on computers, including text and images, can be represented as points in a high-dimensional space. However, the essential algorithms for working with such data tend to become bogged down very quickly as dimension increases. It is therefore desirable to reduce the dimensionality of the data in a way that preserves its relevant structure. The Johnson–Lindenstrauss lemma is a classic result in this vein.
Also the lemma is tight up to a factor log(1/ε), i.e. there exists a set of points of size m that needs dimension
in order to preserve the distances between all pair of points. See 4.
Lemma
Given 0 < ε < 1, a set X of m points in RN, and a number n > n 0 = O(ln(m) / ε 2), there is a Lipschitz function ƒ : RN → Rn such thatfor all u, v ∈ X.
One proof of the lemma takes ƒ to be a suitable multiple of the orthogonal projection onto a random subspace of dimension n in RN, and exploits the phenomenon of concentration of measure
Concentration of measure
In mathematics, concentration of measure is a principle that is applied in measure theory, probability and combinatorics, and has consequences for other fields such as Banach space theory. Informally, it states that "A random variable that depends in a Lipschitz way on many independent variables ...
.