Kernel adaptive filter
Encyclopedia
Kernel adaptive filtering is an adaptive filtering technique for general nonlinear problems. It is a natural generalization of linear adaptive filtering in reproducing kernel Hilbert space
Reproducing kernel Hilbert space
In functional analysis , a reproducing kernel Hilbert space is a Hilbert space of functions in which pointwise evaluation is a continuous linear functional. Equivalently, they are spaces that can be defined by reproducing kernels...

s. Kernel adaptive filters are online kernel methods, closely related to some artificial neural networks such as radial basis function network
Radial basis function network
A radial basis function network is an artificial neural network that uses radial basis functions as activation functions. It is a linear combination of radial basis functions...

s and regularization networks. Some distinguishing features include: The learning process is online
Online machine learning
In machine learning, online learning is a model of induction that learns one instance at atime. The goal in online learning is to predict labels for instances.For example, the instances could describe the current conditions of...

, the learning process is convex with no local minima, and the learning process requires moderate complexity
Computational complexity theory
Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other...

.

Adaptive Filtering

A linear adaptive filter is a linear filter built on basic operational units like adders and multipliers and is usually implemented by programmable digital processors. Mathematically it can be modeled by a linear combiner . Supplied with an input , the output of the filter is .

is also called the linear coefficients (weights) of the filter. The dimensionality of is the filter order. A unique feature of an adaptive filter is that its coefficient can be updated online according to some optimization criterion. One common criterion is to minimize the mean square error . As you see, the adaptation of the weights is a supervised learning
Supervised learning
Supervised learning is the machine learning task of inferring a function from supervised training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object and a desired output value...

 process, which requires training data . The updating rule is


where is the filter weight at time i-1. The error is the prediction error of on the i-th datum


And is the algorithm gain, which can assume different formats in different algorithms. The most notable adaptive filters include least mean squares filter
Least mean squares filter
Least mean squares algorithms are a class of adaptive filter used to mimic a desired filter by finding the filter coefficients that relate to producing the least mean squares of the error signal . It is a stochastic gradient descent method in that the filter is only adapted based on the error at...

 and recursive least squares filter. Despite their simple structure (and probably because of it), they enjoy wide applicability and successes in diverse fields such as communications, control, radar, sonar, seismology, and biomedical engineering, among others. The theory of linear adaptive filters has reached a highly mature stage of development. However, the same can not be said about nonlinear adaptive filters.

Adaptive Filtering in Reproducing Kernel Hilbert Spaces

Kernel adaptive filters are linear adaptive filters in reproducing kernel Hilbert spaces. They belong to a more general methodology called kernel methods
Kernel methods
In computer science, kernel methods are a class of algorithms for pattern analysis, whose best known elementis the support vector machine...

. The main idea of kernel methods can be summarized as follows: transform the input data into a high-dimensional feature space via a positive definite kernel such that the inner product operation in the feature space can be computed efficiently through the kernel evaluation. Then appropriate linear methods are subsequently applied on the transformed data. As long as we can formulate the algorithm in terms of inner product (or equivalent kernel evaluation), we never explicitly have to compute in the high dimensional feature space. While this methodology is called kernel trick, we have to point out that the underlying reproducing kernel Hilbert space plays a central role to provide linearity, convexity, and universal approximation capability at the same time. Successful examples of this methodology include support vector machines, principal component analysis, Fisher discriminant analysis and many others.

Kernel adaptive filters include kernel least mean square, kernel affine projection algorithms, kernel recursive least squares, extended kernel recursive least squares and kernel Kalman filter. Viewed as a learning problem, kernel adaptive filters aim to estimate sequentially by minimizing . The general updating rule of a kernel adaptive filter is


where is the estimate at time . is the prediction error of on the th datum.

Kernel adaptive filters provide a new perspective for linear adaptive filters since linear adaptive filters become a special case being alternatively expressed in the dual space. Kernel adaptive filters clearly show that there is a growing memory structure embedded in the filter weights. They naturally create a growing radial basis function network, learning the network topology and adapting the free parameters directly from data at the same time. The learning rule is a beautiful combination of the error-correction and memory-based learning, and potentially it will have a deep impact on our understanding about the essence of learning theory.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK