Quadratic classifier
A quadratic classifier is used in machine learning and statistical classification to separate measurements of two or more classes of objects or events by a quadric surface. It is a more general version of the linear classifier.
The classification problem
Statistical classification considers a set of vectors of observations x of an object or event, each of which has a known type y. This set is referred to as the training set. The problem is then to determine, for a given new observation vector, the class to which it most likely belongs. For a quadratic classifier, the correct solution is assumed to be quadratic in the measurements, so y will be decided on the basis of

$\mathbf{x}^{\mathsf{T}} A \mathbf{x} + \mathbf{b}^{\mathsf{T}} \mathbf{x} + c$
In the special case where each observation consists of two measurements, this means that the surfaces separating the classes will be conic sections (i.e. either a line, a circle or ellipse, a parabola or a hyperbola).
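A quadratic decision rule of this kind can be sketched in a few lines. The sketch below is illustrative only: the function name and the coefficients A, b and c are hypothetical, chosen so that the decision boundary is the unit circle, one of the conic sections mentioned above.

```python
import numpy as np

def quadratic_classify(x, A, b, c):
    """Assign class 1 if the quadratic form x^T A x + b^T x + c is positive, else class 0."""
    x = np.asarray(x, dtype=float)
    return int(x @ A @ x + b @ x + c > 0)

# Hypothetical coefficients: A = I, b = 0, c = -1 gives the decision
# boundary x1^2 + x2^2 = 1, i.e. the unit circle (a conic section).
A = np.eye(2)
b = np.zeros(2)
c = -1.0

print(quadratic_classify([2.0, 0.0], A, b, c))  # outside the circle -> 1
print(quadratic_classify([0.1, 0.2], A, b, c))  # inside the circle -> 0
```

With a different choice of A the same rule yields elliptic, parabolic or hyperbolic boundaries.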
Quadratic discriminant analysis
Quadratic discriminant analysis (QDA) is closely related to linear discriminant analysis (LDA), where it is assumed that there are only two classes of points (so $y \in \{0, 1\}$) and that the measurements are normally distributed. Unlike LDA, however, in QDA there is no assumption that the covariance of each of the classes is identical. When the normality assumption is true, the best possible test for the hypothesis that a given measurement is from a given class is the likelihood ratio test. Suppose the means of each class are known to be $\mu_{y=0}$ and $\mu_{y=1}$ and the covariances $\Sigma_{y=0}$ and $\Sigma_{y=1}$. Then the likelihood ratio is given by

$\text{Likelihood ratio} = \frac{\sqrt{2\pi\,|\Sigma_{y=1}|}^{\,-1} \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\mu_{y=1})^{\mathsf{T}} \Sigma_{y=1}^{-1} (\mathbf{x}-\mu_{y=1})\right)}{\sqrt{2\pi\,|\Sigma_{y=0}|}^{\,-1} \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\mu_{y=0})^{\mathsf{T}} \Sigma_{y=0}^{-1} (\mathbf{x}-\mu_{y=0})\right)} < t$

for some threshold t. After some rearrangement, it can be shown that the resulting separating surface between the classes is a quadratic.
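The likelihood ratio test can be sketched directly from the Gaussian densities. This is a minimal sketch, not a production implementation: the function names and the class parameters (means and unequal covariances) are hypothetical, chosen only to illustrate the QDA setting.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a multivariate normal with mean mu and covariance sigma at x."""
    d = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff) / norm

def qda_predict(x, mu0, sigma0, mu1, sigma1, t=1.0):
    """Classify x by comparing the likelihood ratio against threshold t."""
    ratio = gaussian_pdf(x, mu1, sigma1) / gaussian_pdf(x, mu0, sigma0)
    return 1 if ratio > t else 0

# Hypothetical class parameters with unequal covariances (the QDA setting,
# where LDA's shared-covariance assumption is dropped).
mu0, sigma0 = np.array([0.0, 0.0]), np.eye(2)
mu1, sigma1 = np.array([3.0, 3.0]), np.array([[2.0, 0.3], [0.3, 1.0]])

print(qda_predict(np.array([0.2, -0.1]), mu0, sigma0, mu1, sigma1))  # near mu0 -> 0
print(qda_predict(np.array([3.1, 2.8]), mu0, sigma0, mu1, sigma1))  # near mu1 -> 1
```

Because the two covariances differ, the set of points where the ratio equals t is a quadric rather than a hyperplane.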
Other quadratic classifiers
While QDA is the most commonly used method for obtaining a classifier, other methods are also possible. One such method is to create a longer measurement vector from the old one by adding all pairwise products of individual measurements. For instance, the vector

$[x_1,\; x_2,\; x_3]$

would become

$[x_1,\; x_2,\; x_3,\; x_1^2,\; x_1 x_2,\; x_1 x_3,\; x_2^2,\; x_2 x_3,\; x_3^2]$.
Finding a quadratic classifier for the original measurements would then become the same as finding a linear classifier based on the expanded measurement vector. For linear classifiers based only on dot products, these expanded measurements do not have to be computed explicitly, since the dot product in the higher-dimensional space is simply related to that in the original space. This is an example of the so-called kernel trick, which can be applied to linear discriminant analysis as well as the support vector machine.
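The relationship between the two dot products can be checked numerically. The sketch below uses a slight variant of the expansion above: it appends all ordered pairwise products (so both x1*x2 and x2*x1 appear), an assumed choice that makes the identity phi(x)·phi(y) = x·y + (x·y)^2 exact; the function names are hypothetical.

```python
import numpy as np

def expand(x):
    """Append all ordered pairwise products x_i * x_j to the vector."""
    return np.concatenate([x, np.outer(x, x).ravel()])

def quadratic_kernel(x, y):
    """Dot product in the expanded space, computed entirely in the original space."""
    s = x @ y
    return s + s ** 2

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)

explicit = expand(x) @ expand(y)   # dot product of the expanded vectors
implicit = quadratic_kernel(x, y)  # same value, without ever expanding
print(np.isclose(explicit, implicit))  # -> True
```

This is exactly why kernel methods never need to materialize the expanded vectors: the kernel evaluates the high-dimensional dot product at the cost of a low-dimensional one.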