Multilinear subspace learning
Encyclopedia
Multilinear subspace learning (MSL) aims to learn a specific small part of a large space of multidimensional objects having a particular desired property. It is a dimensionality reduction approach for finding a low-dimensional representation with certain preferred characteristics of high-dimensional tensor
data through direct mapping
, without going through vectorization
. The term tensor in MSL refers to multidimensional arrays. Examples of tensor data include image
s (2D/3D), video
sequences (3D/4D), and hyperspectral cubes
(3D/4D). The mapping from a high-dimensional tensor space to a low-dimensional tensor space or vector space
is named as multilinear projection.
MSL methods are higher-order generalizations of linear subspace
learning methods such as principal component analysis (PCA) and linear discriminant analysis
(LDA). In the literature, MSL is also referred to as tensor subspace learning or tensor subspace analysis. Research on MSL has progressed from heuristic exploration in 2000s to systematic investigation in 2010s.
and storage technology, massive multidimensional data are being generated on a daily basis in a wide range of emerging applications. These massive, multidimensional data are usually very high-dimensional, with a large amount of redundancy, and only occupying a part of the input space. Therefore, dimensionality reduction is frequently employed to map high-dimensional data to a low-dimensional space while retaining as much information as possible.
Linear subspace
learning algorithms are traditional dimensionality reduction techniques that represent input data as vectors and solve for an optimal linear mapping to a lower dimensional space. Unfortunately, they often become inadequate when dealing with massive multidimensional data. They result in very high-dimensional vectors, lead to the estimation of a large number of parameters, and also break the natural structure and correlation in the original data.
MSL is closely related to tensor decompositions. They both employ multilinear algebra
tools. The difference is that tensor decomposition focuses on factor analysis
, while MSL focuses on dimensionality reduction
.
in 1960s.
, also known as the parallel factors (PARAFAC) decomposition .
This is originated from the alternating least square method for multi-way data analysis .
The disadvantages of MSL are:
Tensor
Tensors are geometric objects that describe linear relations between vectors, scalars, and other tensors. Elementary examples include the dot product, the cross product, and linear maps. Vectors and scalars themselves are also tensors. A tensor can be represented as a multi-dimensional array of...
data through direct mapping
Map (mathematics)
In most of mathematics and in some related technical fields, the term mapping, usually shortened to map, is either a synonym for function, or denotes a particular kind of function which is important in that branch, or denotes something conceptually similar to a function.In graph theory, a map is a...
, without going through vectorization
Vectorization (mathematics)
In mathematics, especially in linear algebra and matrix theory, the vectorization of a matrix is a linear transformation which converts the matrix into a column vector...
. The term tensor in MSL refers to multidimensional arrays. Examples of tensor data include image
Image
An image is an artifact, for example a two-dimensional picture, that has a similar appearance to some subject—usually a physical object or a person.-Characteristics:...
s (2D/3D), video
Video
Video is the technology of electronically capturing, recording, processing, storing, transmitting, and reconstructing a sequence of still images representing scenes in motion.- History :...
sequences (3D/4D), and hyperspectral cubes
Hyperspectral imaging
Hyperspectral imaging collects and processes information from across the electromagnetic spectrum. Much as the human eye sees visible light in three bands , spectral imaging divides the spectrum into many more bands...
(3D/4D). The mapping from a high-dimensional tensor space to a low-dimensional tensor space or vector space
Vector space
A vector space is a mathematical structure formed by a collection of vectors: objects that may be added together and multiplied by numbers, called scalars in this context. Scalars are often taken to be real numbers, but one may also consider vector spaces with scalar multiplication by complex...
is named as multilinear projection.
MSL methods are higher-order generalizations of linear subspace
Linear subspace
The concept of a linear subspace is important in linear algebra and related fields of mathematics.A linear subspace is usually called simply a subspace when the context serves to distinguish it from other kinds of subspaces....
learning methods such as principal component analysis (PCA) and linear discriminant analysis
Linear discriminant analysis
Linear discriminant analysis and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events...
(LDA). In the literature, MSL is also referred to as tensor subspace learning or tensor subspace analysis. Research on MSL has progressed from heuristic exploration in 2000s to systematic investigation in 2010s.
Background
With the advances in data acquisitionData acquisition
Data acquisition is the process of sampling signals that measure real world physical conditions and converting the resulting samples into digital numeric values that can be manipulated by a computer. Data acquisition systems typically convert analog waveforms into digital values for processing...
and storage technology, massive multidimensional data are being generated on a daily basis in a wide range of emerging applications. These massive, multidimensional data are usually very high-dimensional, with a large amount of redundancy, and only occupying a part of the input space. Therefore, dimensionality reduction is frequently employed to map high-dimensional data to a low-dimensional space while retaining as much information as possible.
Linear subspace
Linear subspace
The concept of a linear subspace is important in linear algebra and related fields of mathematics.A linear subspace is usually called simply a subspace when the context serves to distinguish it from other kinds of subspaces....
learning algorithms are traditional dimensionality reduction techniques that represent input data as vectors and solve for an optimal linear mapping to a lower dimensional space. Unfortunately, they often become inadequate when dealing with massive multidimensional data. They result in very high-dimensional vectors, lead to the estimation of a large number of parameters, and also break the natural structure and correlation in the original data.
MSL is closely related to tensor decompositions. They both employ multilinear algebra
Multilinear algebra
In mathematics, multilinear algebra extends the methods of linear algebra. Just as linear algebra is built on the concept of a vector and develops the theory of vector spaces, multilinear algebra builds on the concepts of p-vectors and multivectors with Grassmann algebra.-Origin:In a vector space...
tools. The difference is that tensor decomposition focuses on factor analysis
Factor analysis
Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved, uncorrelated variables called factors. In other words, it is possible, for example, that variations in three or four observed variables...
, while MSL focuses on dimensionality reduction
Dimensionality reduction
In machine learning, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.-Feature selection:...
.
Multilinear projection
A multilinear subspace is defined through a multilinear projection that maps the input tensor data from one space to another (lower-dimensional) space. The original idea is due to Hitchcock in 1927.Tensor-to-tensor projection (TTP)
A TTP is a direct projection of a high-dimensional tensor to a low-dimensional tensor of the same order, using N projection matrices for an Nth-order tensor. It can be performed in N steps with each step performing a tensor-matrix multiplication (product). The N steps are exchangeable. This projection is an extension of the higher-order singular value decomposition (HOSVD) to subspace learning. Hence, its origin is traced back to the Tucker decompositionTucker decomposition
In mathematics, Tucker decomposition decomposes a tensor into a set of matrices and one small core tensor. It is named after Ledyard R. Tuckeralthough it goes back to Hitchcock in 1927....
in 1960s.
Tensor-to-vector projection (TVP)
A TVP is a direct projection of a high-dimensional tensor to a low-dimensional vector, which is also referred to as the rank-one projections. As TVP projects a tensor to a vector, it can be viewed as multiple projections from a tensor to a scalar. Thus, the TVP of a tensor to a P-dimensional vector consists of P projections from the tensor to a scalar. The projection from a tensor to a scalar is an elementary multilinear projection (EMP). In EMP, a tensor is projected to a point through N unit projection vectors. It is the projection of a tensor on a single line (resulting a scalar), with one projection vector in each mode. Thus, the TVP of a tensor object to a vector in a P-dimensional vector space consists of P EMPs. This projection is an extension of the canonical decompositionCP decomposition
In multilinear algebra, the canonical polyadic decomposition , historically known as PARAFAC and later CANDECOMP, is a generalization of the matrix singular value decomposition to tensors, with many applications in in statistics, signal processing, psychometrics, linguistics and chemometrics...
, also known as the parallel factors (PARAFAC) decomposition .
Typical approach in MSL
There are N sets of parameters to be solved, one in each mode. The solution to one set often depends on the other sets (except when N=1, the linear case). Therefore, the suboptimal iterative procedure in is followed.- Initialization of the projections in each mode
- For each mode, fixing the projection in all the other mode, and solve for the projection in the current mode.
- Do the mode-wise optimization for a few iterations or until convergence.
This is originated from the alternating least square method for multi-way data analysis .
Pros and cons
The advantages of MSL are:- It operates on natural tensorial representation of multidimensional data so the structure and correlation in the original data are preserved.
- The number of parameters to be estimated is much smaller than in the linear case.
- It has fewer problems in the small sample size scenario.
The disadvantages of MSL are:
- Most MSL algorithm are iterative. They may be affected by initialization method and have convergence problem.
- The solution obtained is local optimumLocal optimumLocal optimum is a term in applied mathematics and computer science.A local optimum of a combinatorial optimization problem is a solution that is optimal within a neighboring set of solutions...
.
Algorithms
- Multilinear extension of PCA
- Multilinear Principal Component AnalysisMultilinear principal component analysisMultilinear principal-component analysis is a mathematical procedure that uses multiple orthogonal transformations to convert a set of multidimensional objects into another set of multidimensional objects of lower dimensions. There is one orthogonal transformation for each dimension...
(MPCA) - Uncorrelated Multilinear Principal Component Analysis (UMPCA)
- Multilinear Principal Component Analysis
- Multilinear extension of LDALinear discriminant analysisLinear discriminant analysis and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events...
- Discriminant Analysis with Tensor Representation (DATER)
- General tensor discriminant analysis (GTDA)
- Uncorrelated Multilinear Discriminant Analysis (UMLDA)
Pedagogical resources
- Survey: A survey of multilinear subspace learning for tensor data (open access version).
- Lecture: Video lecture on UMPCA at the 25th International Conference on Machine Learning (ICML 2008).
Code
- MATLAB Tensor Toolbox by Sandia National LaboratoriesSandia National LaboratoriesThe Sandia National Laboratories, managed and operated by the Sandia Corporation , are two major United States Department of Energy research and development national laboratories....
. - The MPCA algorithm written in Matlab (MPCA+LDA included).
Tensor data sets
- 3D gait data (third-order tensors): 128x88x20(21.2M); 64x44x20(9.9M); 32x22x10(3.2M);