Empirical orthogonal functions
In statistics and signal processing, the method of empirical orthogonal function (EOF) analysis is a decomposition of a signal or data set in terms of orthogonal basis functions which are determined from the data. It is the same as performing a principal components analysis on the data, except that the EOF method finds both time series and spatial patterns. In geophysics, the term is also used interchangeably with geographically weighted PCA.
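As a rough illustration of how one decomposition yields both spatial patterns and time series, the sketch below applies a singular value decomposition to a synthetic space-time field. The function name `eof_analysis`, the array layout (rows are times, columns are grid points) and the synthetic data are assumptions made for this example, not part of the method's definition.

```python
import numpy as np

def eof_analysis(field):
    """Decompose a (n_times, n_points) array into EOFs and their time series.

    Hypothetical helper for illustration; assumes rows are time samples
    and columns are spatial locations.
    """
    # Remove the time mean at each location so the SVD describes variance.
    anomalies = field - field.mean(axis=0)
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                      # each row is one spatial pattern
    pcs = u * s                    # each column is the matching time series
    variance_fraction = s**2 / np.sum(s**2)
    return eofs, pcs, variance_fraction

# Synthetic field: two standing patterns plus weak noise (made-up data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
x = np.linspace(0.0, 1.0, 30)
field = (np.outer(np.sin(t), np.sin(np.pi * x))
         + 0.3 * np.outer(np.cos(3.0 * t), np.sin(2.0 * np.pi * x))
         + 0.01 * rng.standard_normal((t.size, x.size)))

eofs, pcs, variance_fraction = eof_analysis(field)
print("variance fraction of the first three modes:", variance_fraction[:3])
```

Multiplying `pcs` by `eofs` recovers the anomaly field, so each mode is a spatial pattern paired with the time series that modulates it.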
The ith basis function is chosen to be orthogonal to the basis functions from the first through i − 1, and to minimize the residual variance. That is, the basis functions are chosen to be different from each other, and to account for as much variance as possible.
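One common way to write this selection rule, given here as a sketch under stated assumptions, is the following. The symbols are introduced for illustration and do not appear in the text above: $x$ is a zero-mean data vector, $C$ its covariance matrix, $e_i$ the ith basis function and $\lambda_i$ the ith largest eigenvalue of $C$.

```latex
% Assumed notation: x zero-mean data vector, C = E[x x^T] its covariance,
% e_i the i-th basis function, \lambda_i the i-th largest eigenvalue of C.
\begin{align}
  e_i &= \arg\max_{\substack{\|e\| = 1 \\ e \perp e_1, \dots, e_{i-1}}} e^{\mathsf T} C \, e , \\
  \mathbb{E}\,\Bigl\| x - \sum_{j=1}^{i} \bigl(e_j^{\mathsf T} x\bigr)\, e_j \Bigr\|^2
      &= \operatorname{tr} C - \sum_{j=1}^{i} \lambda_j .
\end{align}
```

Under these assumptions, maximizing the variance captured by each new function is the same as minimizing the variance left in the residual, and the maximizers are eigenvectors of $C$.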
Thus this method has much in common with the method of kriging in geostatistics and Gaussian process models.
The method of EOF is similar in spirit to harmonic analysis, but harmonic analysis typically uses predetermined orthogonal functions, for example, sine and cosine functions at fixed frequencies. In some cases the two methods may yield essentially the same results.
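One situation in which the two methods coincide can be sketched numerically: when the covariance between two points on a periodic domain depends only on their separation, the covariance matrix is circulant and its eigenvectors are sinusoids. The covariance shape, the grid size `n`, the parameter `kappa` and the helper names below are all assumptions chosen for this illustration.

```python
import numpy as np

def circulant_covariance(n=64, kappa=4.0):
    """Covariance on a periodic grid that depends only on the lag (assumed form)."""
    idx = np.arange(n)
    lag = idx[:, None] - idx[None, :]
    return np.exp(kappa * (np.cos(2.0 * np.pi * lag / n) - 1.0))

def dominant_wavenumber(vec):
    """Return the strongest Fourier wavenumber of vec and its share of the power."""
    power = np.abs(np.fft.rfft(vec)) ** 2
    return int(np.argmax(power)), float(power.max() / power.sum())

cov = circulant_covariance()
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]      # sort modes by decreasing variance
eofs = eigvecs[:, order]

for i in range(5):
    k, share = dominant_wavenumber(eofs[:, i])
    print(f"EOF {i + 1}: wavenumber {k}, share of spectral power {share:.6f}")
```

Each leading EOF here has its spectral power concentrated at a single wavenumber, so in this special case the data-derived basis is a Fourier basis. For data without this translation-invariant structure, no such agreement should be expected.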
The basis functions are typically found by computing the eigenvectors of the covariance matrix of the data set. A more advanced technique is to form a kernel matrix out of the data, using a fixed kernel function. The basis functions obtained from the eigenvectors of the kernel matrix are thus non-linear in the location of the data (see Mercer's theorem and the kernel trick for more information).
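Both routes can be sketched in a few lines of NumPy. The function names, the choice of a Gaussian (RBF) kernel, the value of `gamma` and the random test data are assumptions for this example. The text above does not specify a particular kernel.

```python
import numpy as np

def eofs_from_covariance(data):
    """Eigen-decompose the sample covariance of a (n_samples, n_features) array."""
    anomalies = data - data.mean(axis=0)
    cov = anomalies.T @ anomalies / (data.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    return eigvals[order], eigvecs[:, order]

def kernel_eofs(data, gamma=1.0):
    """Eigen-decompose a centered kernel matrix (RBF kernel is an assumed choice)."""
    sq_norms = np.sum(data**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * data @ data.T
    kernel = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    n = kernel.shape[0]
    # Center the kernel matrix, the analogue of subtracting the mean.
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    centered = centering @ kernel @ centering
    eigvals, eigvecs = np.linalg.eigh(centered)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Made-up data: 50 samples of 4 variables with unequal spread.
rng = np.random.default_rng(1)
data = rng.standard_normal((50, 4)) @ np.diag([3.0, 2.0, 1.0, 0.5])

cov_vals, cov_vecs = eofs_from_covariance(data)
ker_vals, ker_vecs = kernel_eofs(data, gamma=0.1)
print("covariance eigenvalues:", cov_vals)
print("leading kernel eigenvalues:", ker_vals[:4])
```

In the covariance route the eigenvectors are directions in the space of variables, while in the kernel route there is one eigenvector entry per sample, which is how the resulting basis functions can depend non-linearly on the data locations.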
See also
- Blind signal separation
- Nonlinear dimensionality reduction
- Orthogonal matrix
- Source separation
- Transform coding
- Varimax rotation