
Preconditioner
In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solution. Preconditioning is typically related to reducing a condition number of the problem. The preconditioned problem is then usually solved by an iterative method.

Preconditioning for linear systems

In linear algebra and numerical analysis, a preconditioner $P$ of a matrix $A$ is a matrix such that $P^{-1}A$ has a smaller condition number than $A$. It is also common to call $T = P^{-1}$ the preconditioner, rather than $P$, since $P$ itself is rarely explicitly available. In modern preconditioning, the application of $T = P^{-1}$, i.e., multiplication of a column vector, or a block of column vectors, by $T = P^{-1}$, is commonly performed by rather sophisticated computer software packages in a matrix-free fashion, i.e., where neither $P$, nor $T = P^{-1}$ (and often not even $A$) is explicitly available in matrix form.
Preconditioners are useful in iterative methods to solve a linear system $Ax = b$ for $x$, since the rate of convergence for most iterative linear solvers increases as the condition number of the matrix decreases as a result of preconditioning. Preconditioned iterative solvers typically outperform direct solvers, e.g., Gaussian elimination, for large, especially sparse, matrices. Iterative solvers can be used as matrix-free methods, i.e., they become the only choice if the coefficient matrix $A$ is not stored explicitly, but is accessed by evaluating matrix–vector products.
Description

Instead of solving the original linear system $Ax = b$ above, one may solve either the right preconditioned system

$$AP^{-1}Px = b$$

via solving

$$AP^{-1}y = b$$

for $y$ and $Px = y$ for $x$; or the left preconditioned system

$$P^{-1}(Ax - b) = 0,$$

both of which give the same solution as the original system so long as the preconditioner matrix $P$ is nonsingular. Left preconditioning is more common.

The goal of this preconditioned system is to reduce the condition number of the left or right preconditioned system matrix $P^{-1}A$ or $AP^{-1}$, respectively. The preconditioned matrix $P^{-1}A$ or $AP^{-1}$ is almost never explicitly formed. Only the action of applying the preconditioner solve operation $P^{-1}$ to a given vector needs to be computed in iterative methods.
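As an added illustration (not part of the original article), the following SciPy sketch shows this matrix-free use of a preconditioner: only the action of $P^{-1}$ on a vector is handed to the solver, here built from an incomplete LU factorization of an assumed model matrix.

```python
# Illustrative sketch, assuming SciPy is available: the preconditioner enters the iterative
# solver only through its action on a vector, wrapped in a LinearOperator built from an
# incomplete LU factorization of a sparse model matrix.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")  # 1-D Poisson-type matrix
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-4)                 # approximate factorization, P close to A
M = spla.LinearOperator((n, n), matvec=ilu.solve)  # exposes only the action r -> P^{-1} r

x, info = spla.gmres(A, b, M=M)                    # preconditioned GMRES
print(info, np.linalg.norm(A @ x - b))             # info == 0 indicates convergence
```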
Typically there is a trade-off in the choice of $P$. Since the operator $P^{-1}$ must be applied at each step of the iterative linear solver, it should have a small cost (computing time) of applying the $P^{-1}$ operation. The cheapest preconditioner would therefore be $P = I$, since then $P^{-1} = I$. Clearly, this results in the original linear system and the preconditioner does nothing. At the other extreme, the choice $P = A$ gives $P^{-1}A = AP^{-1} = I$, which has an optimal condition number of 1, requiring a single iteration for convergence; however, in this case $P^{-1} = A^{-1}$, and the preconditioner solve is as difficult as solving the original system. One therefore chooses $P$ somewhere between these two extremes, in an attempt to achieve a minimal number of linear iterations while keeping the operator $P^{-1}$ as simple as possible. Some examples of typical preconditioning approaches are detailed below.
Preconditioned iterative methods

Preconditioned iterative methods for $Ax - b = 0$ are, in most cases, mathematically equivalent to standard iterative methods applied to the preconditioned system $P^{-1}(Ax - b) = 0$. For example, the standard Richardson iteration for solving $Ax - b = 0$ is

$$x_{n+1} = x_n - \gamma_n (A x_n - b), \quad n \ge 0.$$

Applied to the preconditioned system $P^{-1}(Ax - b) = 0$, it turns into a preconditioned method

$$x_{n+1} = x_n - \gamma_n P^{-1}(A x_n - b), \quad n \ge 0.$$
Examples of popular preconditioned iterative methods for linear systems include the preconditioned conjugate gradient method, the biconjugate gradient method, and the generalized minimal residual method. Iterative methods that use scalar products to compute the iterative parameters require corresponding changes in the scalar product, together with substituting $P^{-1}(Ax - b) = 0$ for $Ax - b = 0$.
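The preconditioned Richardson iteration above can be sketched in a few lines of NumPy; the diagonally dominant test matrix, the Jacobi-type preconditioner, and the fixed step $\gamma = 1$ are illustrative assumptions, not part of the original article.

```python
# Sketch of the preconditioned Richardson iteration
#   x_{n+1} = x_n - gamma * P^{-1} (A x_n - b)
# with P = diag(A); for gamma = 1 this coincides with the Jacobi method.
import numpy as np

rng = np.random.default_rng(0)
n = 100
A = rng.uniform(-1.0, 1.0, size=(n, n)) + 2.0 * n * np.eye(n)  # strongly diagonally dominant
b = rng.standard_normal(n)

p_inv = 1.0 / np.diag(A)   # action of P^{-1} for the diagonal preconditioner
x = np.zeros(n)
gamma = 1.0
for _ in range(100):
    x = x - gamma * (p_inv * (A @ x - b))   # preconditioned Richardson step

print(np.linalg.norm(A @ x - b))            # small residual indicates convergence
```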
Geometric interpretation

For a symmetric positive definite matrix $A$, the preconditioner $P$ is typically chosen to be symmetric positive definite as well. The preconditioned operator $P^{-1}A$ is then also symmetric positive definite, but with respect to the $P$-based scalar product. In this case, the desired effect in applying a preconditioner is to make the quadratic form of the preconditioned operator $P^{-1}A$ with respect to the $P$-based scalar product nearly spherical (see http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf).
Variable and non-linear preconditioning

Denoting $T = P^{-1}$, we highlight that preconditioning is practically implemented as multiplying some vector $r$ by $T$, i.e., computing the product $Tr$. In many applications, $T$ is not given as a matrix, but rather as an operator $T(r)$ acting on the vector $r$. Some popular preconditioners, however, change with $r$, and the dependence on $r$ may not be linear. Typical examples involve using non-linear iterative methods, e.g., the conjugate gradient method, as a part of the preconditioner construction. Such preconditioners may be very efficient in practice; however, their behavior is hard to predict theoretically.
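As a hedged illustration of a preconditioner that is available only as an operator, and one that depends non-linearly on its input, the sketch below implements $T(r)$ as a fixed small number of inner conjugate gradient steps and uses it inside the outer Richardson iteration; the model matrix and iteration counts are assumptions.

```python
# Sketch of a non-linear, operator-only preconditioner: T(r) is a few inner CG steps
# approximating A^{-1} r, used inside an outer Richardson iteration for A x = b.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 400
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")  # SPD model matrix
b = np.ones(n)

def T(r, inner_steps=3):
    """Apply the preconditioner: an approximate solve of A z = r by a few CG steps."""
    z, _ = spla.cg(A, r, maxiter=inner_steps)
    return z

x = np.zeros(n)
for _ in range(30):            # outer Richardson iteration with gamma = 1
    x = x + T(b - A @ x)

print(np.linalg.norm(A @ x - b))
```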
Spectrally equivalent preconditioning

The most common use of preconditioning is for the iterative solution of linear systems resulting from approximations of partial differential equations. The better the approximation quality, the larger the matrix size. In such a case, the goal of optimal preconditioning is, on the one hand, to make the spectral condition number of $P^{-1}A$ bounded from above by a constant independent of the matrix size, which is called spectrally equivalent preconditioning by D'yakonov. On the other hand, the cost of applying $P^{-1}$ should ideally be proportional (also independently of the matrix size) to the cost of multiplying $A$ by a vector.
Jacobi (or diagonal) preconditioner

The Jacobi preconditioner is one of the simplest forms of preconditioning, in which the preconditioner is chosen to be the diagonal of the matrix: $P = \mathrm{diag}(A)$. Assuming $A_{ii} \neq 0$, we get $P^{-1}_{ij} = \delta_{ij}/A_{ii}$. It is efficient for diagonally dominant matrices $A$.
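A tiny NumPy sketch (with an assumed test matrix) of building and applying the Jacobi preconditioner:

```python
# Sketch: the Jacobi preconditioner P = diag(A); applying P^{-1} is an element-wise division.
import numpy as np

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])   # illustrative diagonally dominant matrix
d = np.diag(A)                        # the preconditioner, stored as a vector

def apply_p_inv(r):
    return r / d                      # action of P^{-1} = diag(A)^{-1} on a vector

print(apply_p_inv(np.array([1.0, 2.0, 3.0])))   # -> [0.25  0.5   0.75]
```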
SPAI

The Sparse Approximate Inverse preconditioner minimises $\|AT - I\|_F$, where $\|\cdot\|_F$ is the Frobenius matrix norm and $T = P^{-1}$ is from some suitably constrained set of sparse matrices. Under the Frobenius norm, this reduces to solving numerous independent least-squares problems (one for every column). The entries in $T$ must be restricted to some sparsity pattern, or the problem becomes as hard and time consuming as finding the exact inverse of $A$. This method, as well as means to select sparsity patterns, was introduced by M. J. Grote and T. Huckle, SIAM J. Sci. Comput. 18 (1997), 838–853.
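A hedged sketch of the column-by-column least-squares computation behind SPAI; restricting each column of $T$ to the sparsity pattern of $A$ is one common, but here purely illustrative, choice.

```python
# Sketch of the SPAI idea: minimise ||A T - I||_F one column at a time, with the entries of
# each column of T restricted to a prescribed sparsity pattern (the pattern of A here).
import numpy as np

n = 8
A = 4.0 * np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)  # test matrix

T = np.zeros((n, n))
for j in range(n):
    rows = np.flatnonzero(A[:, j] != 0.0)          # allowed nonzero positions in column j of T
    e_j = np.zeros(n)
    e_j[j] = 1.0
    t, *_ = np.linalg.lstsq(A[:, rows], e_j, rcond=None)  # small independent least-squares problem
    T[rows, j] = t

print(np.linalg.norm(A @ T - np.eye(n), "fro"))    # Frobenius residual of the approximate inverse
```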
Other preconditioners
- Incomplete Cholesky factorization
- Incomplete LU factorization
- Successive over-relaxation
- Multigrid preconditioning

External links
- Preconditioned Conjugate Gradient – math-linux.com
- Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods

Preconditioning for eigenvalue problems

Eigenvalue problems can be framed in several alternative ways, each leading to its own preconditioning. The traditional preconditioning is based on the so-called spectral transformations. Knowing (approximately) the targeted eigenvalue, one can compute the corresponding eigenvector by solving the related homogeneous linear system, thus allowing the use of preconditioning for linear systems. Finally, formulating the eigenvalue problem as an optimization of the Rayleigh quotient brings preconditioned optimization techniques to the scene.

Spectral transformations

By analogy with linear systems, for an eigenvalue problem $Ax = \lambda x$ one may be tempted to replace the matrix $A$ with the matrix $P^{-1}A$ using a preconditioner $P$. However, this makes sense only if the sought eigenvectors of $A$ and $P^{-1}A$ are the same. This is the case for spectral transformations.
The most popular spectral transformation is the so-called shift-and-invert transformation, where for a given scalar $\alpha$, called the shift, the original eigenvalue problem $Ax = \lambda x$ is replaced with the shift-and-invert problem $(A - \alpha I)^{-1}x = \mu x$. The eigenvectors are preserved, and one can solve the shift-and-invert problem by an iterative solver, e.g., the power iteration. This gives inverse iteration, which normally converges to the eigenvector corresponding to the eigenvalue closest to the shift $\alpha$. The Rayleigh quotient iteration is a shift-and-invert method with a variable shift.
Spectral transformations are specific for eigenvalue problems and have no analogs for linear systems. They require accurate numerical calculation of the transformation involved, which becomes the main bottleneck for large problems.
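A hedged sketch of the shift-and-invert idea realized as inverse iteration; the symmetric test matrix, the shift, and the iteration count are illustrative assumptions.

```python
# Sketch of inverse iteration: power iteration applied to (A - alpha I)^{-1} typically
# converges to the eigenvector whose eigenvalue is closest to the shift alpha.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
A = (A + A.T) / 2.0                          # symmetric test matrix (real eigenvalues)

alpha = 1.0                                   # the shift
lu, piv = lu_factor(A - alpha * np.eye(n))    # factor once; reused at every iteration

x = rng.standard_normal(n)
for _ in range(100):
    x = lu_solve((lu, piv), x)                # apply (A - alpha I)^{-1}
    x /= np.linalg.norm(x)

print(x @ A @ x)                              # Rayleigh quotient: eigenvalue closest to alpha
```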
General preconditioning

To make a close connection to linear systems, let us suppose that the targeted eigenvalue $\lambda_\star$ is known (approximately). Then one can compute the corresponding eigenvector from the homogeneous linear system $(A - \lambda_\star I)x = 0$. Using the concept of left preconditioning for linear systems, we obtain $T(A - \lambda_\star I)x = 0$, where $T$ is the preconditioner, which we can try to solve using the Richardson iteration

$$x_{n+1} = x_n - \gamma_n T (A - \lambda_\star I) x_n, \quad n \ge 0.$$
The ideal preconditioning

The Moore–Penrose pseudoinverse $T = (A - \lambda_\star I)^+$ is the preconditioner that makes the Richardson iteration above converge in one step with $\gamma_n = 1$, since $I - (A - \lambda_\star I)^+(A - \lambda_\star I)$, denoted by $P_\star$, is the orthogonal projector onto the eigenspace corresponding to $\lambda_\star$. The choice $T = (A - \lambda_\star I)^+$ is impractical for three independent reasons. First, $\lambda_\star$ is actually not known, although it can be replaced with its approximation $\tilde\lambda_\star$. Second, the exact Moore–Penrose pseudoinverse requires knowledge of the eigenvector, which we are trying to find. This can be somewhat circumvented by the use of the Jacobi–Davidson preconditioner $T = (I - \tilde x \tilde x^*)(A - \tilde\lambda_\star I)^{-1}(I - \tilde x \tilde x^*)$, where $\tilde x$ approximates the targeted eigenvector. Last, but not least, this approach requires accurate numerical solution of a linear system with the system matrix $A - \tilde\lambda_\star I$, which becomes as expensive for large problems as the shift-and-invert method above. If the solution is not accurate enough, step two may be redundant.
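A small numerical check (an added illustration, using an eigenpair computed exactly beforehand) of the statement that the pseudoinverse preconditioner makes the Richardson iteration converge in a single step:

```python
# Check: with T = pinv(A - lambda* I) and gamma = 1, one Richardson step maps any x_0 onto
# the eigenspace of lambda*, i.e. x_1 = (I - T (A - lambda* I)) x_0 is an eigenvector.
import numpy as np

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((n, n))
A = (A + A.T) / 2.0                         # symmetric test matrix
lam, V = np.linalg.eigh(A)
lam_star = lam[0]                            # pretend this eigenvalue is the targeted one

B = A - lam_star * np.eye(n)
T = np.linalg.pinv(B)                        # Moore-Penrose pseudoinverse
x0 = rng.standard_normal(n)
x1 = x0 - T @ (B @ x0)                       # one Richardson step with gamma = 1

print(np.linalg.norm(A @ x1 - lam_star * x1) / np.linalg.norm(x1))  # ~ machine precision
```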
Practical preconditioning

Let us first replace the theoretical value $\lambda_\star$ in the Richardson iteration above with its current approximation $\lambda_n$ to obtain a practical algorithm

$$x_{n+1} = x_n - \gamma_n T (A - \lambda_n I) x_n, \quad n \ge 0.$$

A popular choice is $\lambda_n = \rho(x_n)$, using the Rayleigh quotient function $\rho(x) = x^*Ax / x^*x$. Practical preconditioning may be as trivial as just using $T = (\mathrm{diag}(A))^{-1}$ or $T = (\mathrm{diag}(A - \lambda_n I))^{-1}$. For some classes of eigenvalue problems the efficiency of $T \approx A^{-1}$ has been demonstrated, both numerically and theoretically. The choice $T \approx A^{-1}$ allows one to easily utilize for eigenvalue problems the vast variety of preconditioners developed for linear systems.
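A hedged sketch of the practical iteration above with the illustrative choices $\lambda_n = \rho(x_n)$ and $T = (\mathrm{diag}(A))^{-1}$; the nearly diagonal test matrix and the fixed step size are assumptions made for the demonstration.

```python
# Sketch of the practical preconditioned eigenvalue iteration
#   x_{n+1} = x_n - gamma * T (A - lambda_n I) x_n,   lambda_n = rho(x_n),
# with the trivial diagonal preconditioner T = diag(A)^{-1}.
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = np.diag(np.arange(1.0, n + 1.0)) + 0.01 * rng.standard_normal((n, n))
A = (A + A.T) / 2.0                      # symmetric, nearly diagonal test matrix

t = 1.0 / np.diag(A)                      # action of T = diag(A)^{-1}
x = rng.standard_normal(n)
gamma = 1.0
for _ in range(200):
    lam = (x @ A @ x) / (x @ x)           # Rayleigh quotient lambda_n
    x = x - gamma * (t * (A @ x - lam * x))
    x /= np.linalg.norm(x)                # normalisation keeps the iterate well scaled

print(lam)                                # approaches the smallest eigenvalue (about 1 here)
```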
Due to the changing value $\lambda_n$, a comprehensive theoretical convergence analysis is much more difficult, compared to the linear systems case, even for the simplest methods, such as the Richardson iteration.
Preconditioning in optimization

In optimization, preconditioning is typically used to accelerate first-order optimization algorithms.

Description

For example, to find a local minimum of a real-valued function $F(x)$ using gradient descent, one takes steps proportional to the negative of the gradient $-\nabla F(x)$ (or of the approximate gradient) of the function at the current point:

$$x_{n+1} = x_n - \gamma_n \nabla F(x_n), \quad n \ge 0.$$

The preconditioner is applied to the gradient:

$$x_{n+1} = x_n - \gamma_n P^{-1} \nabla F(x_n), \quad n \ge 0.$$

Preconditioning here can be viewed as changing the geometry of the vector space with the goal of making the level sets look like circles. In this case the preconditioned gradient aims closer to the point of the extrema, which speeds up the convergence.
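A hedged NumPy comparison of plain and diagonally preconditioned gradient descent on an ill-conditioned quadratic; the matrix, step sizes, and iteration count are illustrative assumptions.

```python
# Sketch: gradient descent vs. preconditioned gradient descent on the ill-conditioned
# quadratic F(x) = 1/2 x^T A x - b^T x, with the diagonal preconditioner P = diag(A).
import numpy as np

A = np.array([[1.0, 0.5],
              [0.5, 100.0]])           # elongated elliptical level sets
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b                   # gradient of F

def descend(p_inv, gamma, steps=100):
    x = np.zeros(2)
    for _ in range(steps):
        x = x - gamma * (p_inv * grad(x))
    return x

x_plain = descend(p_inv=np.ones(2), gamma=0.019)      # step limited by the largest eigenvalue
x_prec = descend(p_inv=1.0 / np.diag(A), gamma=1.0)   # nearly circular preconditioned level sets

# the preconditioned run reaches a much smaller gradient (residual) norm
print(np.linalg.norm(grad(x_plain)), np.linalg.norm(grad(x_prec)))
```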
Connection to linear systems

The minimum of the quadratic function

$$F(x) = \tfrac{1}{2} x^T A x - x^T b,$$

where $x$ and $b$ are real column-vectors and $A$ is a real symmetric positive-definite matrix, is exactly the solution of the linear equation $Ax = b$. Since $\nabla F(x) = Ax - b$, the preconditioned gradient descent method of minimizing $F(x)$ is

$$x_{n+1} = x_n - \gamma_n P^{-1}(A x_n - b), \quad n \ge 0.$$

This is the preconditioned Richardson iteration for solving a system of linear equations.

Connection to eigenvalue problems

The minimum of the Rayleigh quotient

$$\rho(x) = \frac{x^T A x}{x^T x},$$

where $x$ is a real non-zero column-vector and $A$ is a real symmetric positive-definite matrix, is the smallest eigenvalue of $A$, while the minimizer is the corresponding eigenvector. Since $\nabla \rho(x)$ is proportional to $A x - \rho(x)x$, the preconditioned gradient descent method of minimizing $\rho(x)$ is

$$x_{n+1} = x_n - \gamma_n P^{-1}(A x_n - \rho(x_n) x_n), \quad n \ge 0.$$

This is an analog of preconditioned Richardson iteration for solving eigenvalue problems.

Variable preconditioning

In many cases, it may be beneficial to change the preconditioner at some or even every step of an iterative algorithm in order to accommodate for a changing shape of the level sets. One should have in mind, however, that constructing an efficient preconditioner is very often computationally expensive. The increased cost of updating the preconditioner can easily override the positive effect of faster convergence.