Sparse PCA
Sparse PCA is a specialised technique used in statistical analysis and, in particular, in the analysis of multivariate data sets.

Ordinary principal component analysis (PCA) uses a vector space transform to reduce multidimensional data sets to lower dimensions for analysis. It finds linear combinations of variables (called "principal components") that correspond to directions of maximal variance in the data. The number of new variables created by these linear combinations is usually much lower than the number of variables in the original data set. Sparse PCA finds sets of sparse vectors (vectors whose elements are mostly zeros) for use as weights in the linear combinations while still explaining most of the variance present in the data.
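
One common way to make this precise, given here as a sketch rather than as the formulation every method uses, is a cardinality-constrained variance maximization. The symbols are notation introduced for this illustration: $\Sigma$ is the sample covariance matrix of the $p$ variables, $v$ is the weight vector, and $k$ is the maximum number of nonzero weights allowed.

```latex
\max_{v \in \mathbb{R}^{p}} \; v^{\top} \Sigma \, v
\quad \text{subject to} \quad \|v\|_{2} = 1, \quad \|v\|_{0} \le k
```

Here $\|v\|_{0}$ counts the nonzero entries of $v$. With $k = p$ the cardinality constraint is inactive and the problem reduces to finding the first ordinary principal component.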

Several approaches have been proposed, including a regression framework, a convex relaxation/semidefinite programming framework, a generalized power method framework, forward/backward greedy search, exact methods using branch-and-bound techniques, and a Bayesian formulation framework.
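
As an illustration of the simplest idea in this list, the following is a minimal sketch of a forward greedy search for a single sparse weight vector, written in plain Python. It is not taken from any particular paper or library. The function names (`covariance`, `leading_eigenpair`, `greedy_sparse_loading`), the small data set, and the choice of power iteration to compute eigenvalues are all assumptions made for this example. At each step the search adds the variable that most increases the largest eigenvalue of the covariance submatrix on the selected variables.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(mat, iters=500):
    """Largest eigenvalue and a unit eigenvector of a symmetric PSD matrix.

    Uses plain power iteration from a fixed start vector. This is a
    simplification: it assumes the start vector is not orthogonal to the
    leading eigenvector.
    """
    m = len(mat)
    v = [1.0 + 0.1 * i for i in range(m)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit-length) iterate.
    value = sum(v[i] * mat[i][j] * v[j] for i in range(m) for j in range(m))
    return value, v


def greedy_sparse_loading(cov, k):
    """Forward greedy search for a weight vector with at most k nonzeros."""
    p = len(cov)
    k = max(1, min(k, p))
    support, best = [], None
    for _ in range(k):
        best = None
        for j in range(p):
            if j in support:
                continue
            idx = support + [j]
            sub = [[cov[a][b] for b in idx] for a in idx]
            value, vec = leading_eigenpair(sub)
            if best is None or value > best[0]:
                best = (value, idx, vec)
        support = best[1]
    explained, support, vec = best
    loading = [0.0] * p
    for pos, j in enumerate(support):
        loading[j] = vec[pos]
    return loading, explained


# Made-up data: variables 0 and 1 vary together, 2 and 3 are small noise.
rows = [
    [2.0, 1.9, 0.1, 0.0],
    [-1.5, -1.4, 0.0, 0.1],
    [0.5, 0.7, -0.1, 0.0],
    [-2.2, -2.0, 0.1, -0.1],
    [1.8, 1.6, 0.0, 0.1],
    [-0.6, -0.8, -0.1, -0.1],
]
cov = covariance(rows)
sparse_loading, sparse_var = greedy_sparse_loading(cov, k=2)
dense_var, dense_loading = leading_eigenpair(cov)
print("sparse weights:", sparse_loading, "variance:", sparse_var)
print("dense weights: ", dense_loading, "variance:", dense_var)
```

The sparse weight vector can never explain more variance than the ordinary first principal component, so comparing the two printed variances shows how much is given up in exchange for using fewer variables. A greedy search of this kind is a heuristic and is not guaranteed to find the best subset of variables.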
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 