Feature extraction
In pattern recognition and in image processing, feature extraction is a special form of dimensionality reduction.

When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (much data, but not much information), the input data is transformed into a reduced representation called a set of features (also named a feature vector). Transforming the input data into the set of features is called feature extraction. If the extracted features are carefully chosen, the feature set is expected to capture the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the full-size input.
General
Feature extraction involves simplifying the amount of resources required to describe a large set of data accurately. When performing analysis of complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation power, or a classification algorithm which overfits the training sample and generalizes poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables that get around these problems while still describing the data with sufficient accuracy.
Best results are achieved when an expert constructs a set of application-dependent features. Nevertheless, if no such expert knowledge is available, general dimensionality reduction techniques may help. These include (a minimal sketch of the first item follows the list):
- Principal component analysis
- Semidefinite embedding
- Multifactor dimensionality reduction
- Multilinear subspace learning
- Nonlinear dimensionality reduction
- Isomap
- Kernel PCA
- Multilinear PCA
- Latent semantic analysis
- Partial least squares
- Independent component analysis
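As a concrete illustration of the first technique in the list, the following is a minimal principal component analysis sketch in Python with NumPy; the synthetic data matrix X, the number of retained components k, and the variable names are assumptions made for this example, not prescriptions from the list above.

    import numpy as np

    # Assumed example data: 100 samples described by 5 correlated variables.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=100)   # deliberate redundancy

    k = 2  # number of features to keep (an assumed choice)

    # Center the data, take the top-k right singular vectors as the principal
    # directions, and project onto them to obtain the reduced feature set.
    X_centered = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    features = X_centered @ Vt[:k].T                 # shape (100, 2)

    print(features.shape)

The same projection could equally be computed from the eigenvectors of the covariance matrix; the SVD route is used here only because it is short and numerically stable.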
Image processing
Feature extraction can be used in the area of image processing, which involves using algorithms to detect and isolate various desired portions or shapes (features) of a digitized image or video stream. It is particularly important in the area of optical character recognition.
Low-level
- Edge detection (a minimal sketch follows this list)
- Corner detection
- Blob detection
- Ridge detection
- Scale-invariant feature transform
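As a sketch of the first item above, edge detection can be performed with Sobel filters, for example via SciPy's ndimage module; the synthetic image and the 0.5 cutoff on the gradient magnitude are assumptions chosen for illustration.

    import numpy as np
    from scipy import ndimage

    # Assumed example image: a bright square on a dark background.
    image = np.zeros((64, 64))
    image[16:48, 16:48] = 1.0

    # Sobel derivatives along each axis, combined into a gradient magnitude;
    # large magnitudes mark pixels where brightness changes sharply (edges).
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    magnitude = np.hypot(gx, gy)

    edges = magnitude > 0.5 * magnitude.max()        # simple binary edge map
    print(int(edges.sum()), "edge pixels detected")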
Shape-based
- Thresholding (a minimal sketch follows this list)
- Blob extraction
- Template matching
- Hough transform
  - Lines
  - Circles/ellipses
  - Arbitrary shapes (generalized Hough transform)
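A minimal sketch of the first item above, global thresholding with NumPy; the synthetic grayscale image and the mean-based threshold rule are assumptions made for the example (real applications often use a method such as Otsu's instead).

    import numpy as np

    # Assumed example: a grayscale image with a bright object on a darker,
    # noisy background.
    rng = np.random.default_rng(1)
    image = rng.integers(0, 100, size=(32, 32))
    image[8:24, 8:24] += 120                         # bright rectangular object

    # Global thresholding: every pixel above the threshold becomes foreground,
    # turning the grayscale image into a binary image.
    threshold = image.mean()                         # simple data-driven choice
    binary = image > threshold

    print(int(binary.sum()), "foreground pixels")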
Feature extraction in software
Many data analysis software packages provide feature extraction and dimension reduction. Common numerical programming environments such as MATLAB, Scilab, NumPy and the R language provide some of the simpler feature extraction techniques (e.g. principal component analysis) via built-in commands. More specific algorithms are often available as publicly available scripts or third-party add-ons.
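For example, in the Python ecosystem one such add-on is scikit-learn (named here as an assumption, since the text above does not mention it); its PCA class reduces a data matrix to an extracted feature set in a few lines.

    import numpy as np
    from sklearn.decomposition import PCA

    # Assumed example data: 200 samples described by 10 variables.
    X = np.random.default_rng(2).normal(size=(200, 10))

    # Keep the 3 leading principal components as the extracted features.
    pca = PCA(n_components=3)
    features = pca.fit_transform(X)

    print(features.shape)                  # (200, 3)
    print(pca.explained_variance_ratio_)   # share of variance each feature retains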
See also
- Cluster analysis
- Dimensionality reduction
- Feature detection
- Feature selection
- Data mining
- Connected-component labeling
- Segmentation (image processing)
External links
- JMLR Special Issue on Variable and Feature Selection
- RapidMiner and the Image Processing Extension for RapidMiner (open-source tools)