Derivation of the conjugate gradient method
In numerical linear algebra, the conjugate gradient method is an iterative method for numerically solving the linear system

$$\mathbf{A}\mathbf{x} = \mathbf{b}$$

where $\mathbf{A}$ is symmetric positive-definite. The conjugate gradient method can be derived from several different perspectives, including specialization of the conjugate direction method for optimization, and variation of the Arnoldi/Lanczos iteration for eigenvalue problems.
The intent of this article is to document the important steps in these derivations.
Derivation from the conjugate direction method
The conjugate gradient method can be seen as a special case of the conjugate direction method applied to minimization of the quadratic function

$$f(\mathbf{x}) = \mathbf{x}^\mathsf{T}\mathbf{A}\mathbf{x} - 2\mathbf{b}^\mathsf{T}\mathbf{x}.$$

The conjugate direction method

In the conjugate direction method for minimizing

$$f(\mathbf{x}) = \mathbf{x}^\mathsf{T}\mathbf{A}\mathbf{x} - 2\mathbf{b}^\mathsf{T}\mathbf{x},$$

one starts with an initial guess $\mathbf{x}_0$ and the corresponding residual $\mathbf{r}_0 = \mathbf{b} - \mathbf{A}\mathbf{x}_0$, and computes the iterate and residual by the formulae

$$\begin{aligned}
\alpha_i &= \frac{\mathbf{p}_i^\mathsf{T}\mathbf{r}_i}{\mathbf{p}_i^\mathsf{T}\mathbf{A}\mathbf{p}_i},\\
\mathbf{x}_{i+1} &= \mathbf{x}_i + \alpha_i\mathbf{p}_i,\\
\mathbf{r}_{i+1} &= \mathbf{r}_i - \alpha_i\mathbf{A}\mathbf{p}_i,
\end{aligned}$$
where $\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2, \ldots$ are a series of mutually conjugate directions, i.e.,

$$\mathbf{p}_i^\mathsf{T}\mathbf{A}\mathbf{p}_j = 0$$

for any $i \neq j$.
The conjugate direction method is imprecise in the sense that no formulae are given for selection of the directions $\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2, \ldots$. Specific choices lead to various methods including the conjugate gradient method and Gaussian elimination.
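As an illustration of the update formulae above, the following is a minimal NumPy sketch of the conjugate direction method; the function name `conjugate_direction` and the dense-matrix setting are illustrative assumptions, and the mutually conjugate directions are assumed to be supplied by the caller.

```python
import numpy as np

def conjugate_direction(A, b, x0, directions):
    """Conjugate direction method: 'directions' must be mutually A-conjugate,
    i.e. p_i^T A p_j = 0 for i != j; how they are chosen is left open."""
    x = x0.copy()
    r = b - A @ x0                      # initial residual r_0
    for p in directions:
        Ap = A @ p
        alpha = (p @ r) / (p @ Ap)      # alpha_i = p_i^T r_i / p_i^T A p_i
        x = x + alpha * p               # x_{i+1} = x_i + alpha_i p_i
        r = r - alpha * Ap              # r_{i+1} = r_i - alpha_i A p_i
    return x
```

Supplying, for example, the eigenvectors of $\mathbf{A}$ as the directions gives one valid (if expensive) instance, since eigenvectors of a symmetric matrix are mutually $\mathbf{A}$-conjugate.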
Derivation from the Arnoldi/Lanczos iteration
The conjugate gradient method can also be seen as a variant of the Arnoldi/Lanczos iteration applied to solving linear systems.

The general Arnoldi method

In the Arnoldi iteration, one starts with a vector $\mathbf{r}_0$ and gradually builds an orthonormal basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots\}$ of the Krylov subspace

$$\mathcal{K}(\mathbf{A}, \mathbf{r}_0) = \operatorname{span}\{\mathbf{r}_0, \mathbf{A}\mathbf{r}_0, \mathbf{A}^2\mathbf{r}_0, \ldots\}$$

by defining $\mathbf{v}_i = \mathbf{w}_i / \lVert\mathbf{w}_i\rVert_2$ where

$$\mathbf{w}_i = \begin{cases}
\mathbf{r}_0 & \text{if } i = 1,\\
\mathbf{A}\mathbf{v}_{i-1} - \sum_{j=1}^{i-1}\left(\mathbf{v}_j^\mathsf{T}\mathbf{A}\mathbf{v}_{i-1}\right)\mathbf{v}_j & \text{if } i > 1.
\end{cases}$$

In other words, for $i > 1$, $\mathbf{v}_i$ is found by Gram-Schmidt orthogonalizing $\mathbf{A}\mathbf{v}_{i-1}$ against $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{i-1}\}$ followed by normalization.
Put in matrix form, the iteration is captured by the equation

$$\mathbf{A}\mathbf{V}_i = \mathbf{V}_{i+1}\tilde{\mathbf{H}}_i$$

where

$$\begin{aligned}
\mathbf{V}_i &= \begin{bmatrix}\mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_i\end{bmatrix},\\
\tilde{\mathbf{H}}_i &= \begin{bmatrix}
h_{11} & h_{12} & h_{13} & \cdots & h_{1,i}\\
h_{21} & h_{22} & h_{23} & \cdots & h_{2,i}\\
 & h_{32} & h_{33} & \cdots & h_{3,i}\\
 & & \ddots & \ddots & \vdots\\
 & & & h_{i,i-1} & h_{i,i}\\
 & & & & h_{i+1,i}
\end{bmatrix} = \begin{bmatrix}\mathbf{H}_i\\ h_{i+1,i}\mathbf{e}_i^\mathsf{T}\end{bmatrix}
\end{aligned}$$

with

$$h_{ji} = \begin{cases}
\mathbf{v}_j^\mathsf{T}\mathbf{A}\mathbf{v}_i & \text{if } j \le i,\\
\lVert\mathbf{w}_{i+1}\rVert_2 & \text{if } j = i + 1,\\
0 & \text{if } j > i + 1.
\end{cases}$$

When applying the Arnoldi iteration to solving linear systems, one starts with $\mathbf{r}_0 = \mathbf{b} - \mathbf{A}\mathbf{x}_0$, the residual corresponding to an initial guess $\mathbf{x}_0$. After each step of iteration, one computes $\mathbf{y}_i = \mathbf{H}_i^{-1}\left(\lVert\mathbf{r}_0\rVert_2\mathbf{e}_1\right)$ and the new iterate $\mathbf{x}_i = \mathbf{x}_0 + \mathbf{V}_i\mathbf{y}_i$.
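The following NumPy sketch illustrates the iteration just described, using the modified Gram-Schmidt variant (equivalent in exact arithmetic to the classical orthogonalization written above); the function name `arnoldi` and the dense-matrix representation are illustrative assumptions rather than part of the derivation.

```python
import numpy as np

def arnoldi(A, r0, m):
    """Arnoldi iteration (modified Gram-Schmidt variant): builds an orthonormal
    basis V of the Krylov subspace K(A, r0) and the upper Hessenberg matrix H
    such that A @ V[:, :k] == V[:, :k+1] @ H[:k+1, :k]."""
    n = r0.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for i in range(m):
        w = A @ V[:, i]
        for j in range(i + 1):              # orthogonalize against v_1, ..., v_i
            H[j, i] = V[:, j] @ w
            w = w - H[j, i] * V[:, j]
        H[i + 1, i] = np.linalg.norm(w)     # ||w_{i+1}||_2
        if H[i + 1, i] == 0.0:              # invariant subspace reached
            break
        V[:, i + 1] = w / H[i + 1, i]
    return V, H

# Illustrative use for a linear system: y_i = H_i^{-1}(||r0|| e_1), x_i = x0 + V_i y_i
# where H_i is the leading i-by-i block of H returned above.
```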
The direct Lanczos method
For the rest of the discussion, we assume that $\mathbf{A}$ is symmetric positive-definite. With symmetry of $\mathbf{A}$, the upper Hessenberg matrix $\mathbf{H}_i = \mathbf{V}_i^\mathsf{T}\mathbf{A}\mathbf{V}_i$ becomes symmetric and thus tridiagonal. It then can be more clearly denoted by

$$\mathbf{H}_i = \begin{bmatrix}
a_1 & b_2\\
b_2 & a_2 & b_3\\
 & \ddots & \ddots & \ddots\\
 & & b_{i-1} & a_{i-1} & b_i\\
 & & & b_i & a_i
\end{bmatrix}.$$

This enables a short three-term recurrence for $\mathbf{v}_i$ in the iteration, and the Arnoldi iteration is reduced to the Lanczos iteration.
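As an illustrative sketch of this reduction (the function name `lanczos` and the dense NumPy representation are assumptions made only for the example), the three-term recurrence can be written as:

```python
import numpy as np

def lanczos(A, r0, m):
    """Lanczos three-term recurrence for symmetric A: builds orthonormal V and
    the diagonal entries a and off-diagonal entries b of the tridiagonal H."""
    n = r0.shape[0]
    V = np.zeros((n, m))
    a = np.zeros(m)            # a[i] holds a_{i+1} in the article's 1-based notation
    b = np.zeros(m)            # b[i] holds b_{i+1}; b[0] is unused
    V[:, 0] = r0 / np.linalg.norm(r0)
    for i in range(m):
        w = A @ V[:, i]
        a[i] = V[:, i] @ w
        w = w - a[i] * V[:, i]
        if i > 0:
            w = w - b[i] * V[:, i - 1]     # only the previous vector is needed
        if i + 1 < m:
            b[i + 1] = np.linalg.norm(w)
            if b[i + 1] == 0.0:            # invariant subspace reached
                break
            V[:, i + 1] = w / b[i + 1]
    return V, a, b
```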
Since $\mathbf{A}$ is symmetric positive-definite, so is $\mathbf{H}_i$. Hence, $\mathbf{H}_i$ can be LU factorized without partial pivoting into

$$\mathbf{H}_i = \mathbf{L}_i\mathbf{U}_i = \begin{bmatrix}
1\\
c_2 & 1\\
 & \ddots & \ddots\\
 & & c_{i-1} & 1\\
 & & & c_i & 1
\end{bmatrix}\begin{bmatrix}
d_1 & b_2\\
 & d_2 & b_3\\
 & & \ddots & \ddots\\
 & & & d_{i-1} & b_i\\
 & & & & d_i
\end{bmatrix}$$

with convenient recurrences for $c_i$ and $d_i$:

$$\begin{aligned}
c_i &= b_i/d_{i-1},\\
d_i &= \begin{cases}a_1 & \text{if } i = 1,\\ a_i - c_ib_i & \text{if } i > 1.\end{cases}
\end{aligned}$$
Rewrite $\mathbf{x}_i = \mathbf{x}_0 + \mathbf{V}_i\mathbf{y}_i$ as

$$\begin{aligned}
\mathbf{x}_i &= \mathbf{x}_0 + \mathbf{V}_i\mathbf{H}_i^{-1}\left(\lVert\mathbf{r}_0\rVert_2\mathbf{e}_1\right)\\
&= \mathbf{x}_0 + \mathbf{V}_i\mathbf{U}_i^{-1}\mathbf{L}_i^{-1}\left(\lVert\mathbf{r}_0\rVert_2\mathbf{e}_1\right)\\
&= \mathbf{x}_0 + \mathbf{P}_i\mathbf{z}_i
\end{aligned}$$

with

$$\begin{aligned}
\mathbf{P}_i &= \mathbf{V}_i\mathbf{U}_i^{-1},\\
\mathbf{z}_i &= \mathbf{L}_i^{-1}\left(\lVert\mathbf{r}_0\rVert_2\mathbf{e}_1\right).
\end{aligned}$$
It is now important to observe that

$$\begin{aligned}
\mathbf{P}_i &= \begin{bmatrix}\mathbf{P}_{i-1} & \mathbf{p}_i\end{bmatrix},\\
\mathbf{z}_i &= \begin{bmatrix}\mathbf{z}_{i-1}\\ \zeta_i\end{bmatrix}.
\end{aligned}$$

In fact, there are short recurrences for $\mathbf{p}_i$ and $\zeta_i$ as well:

$$\begin{aligned}
\mathbf{p}_i &= \frac{1}{d_i}\left(\mathbf{v}_i - b_i\mathbf{p}_{i-1}\right),\\
\zeta_i &= -c_i\zeta_{i-1}.
\end{aligned}$$

With this formulation, we arrive at a simple recurrence for $\mathbf{x}_i$:

$$\begin{aligned}
\mathbf{x}_i &= \mathbf{x}_0 + \mathbf{P}_i\mathbf{z}_i\\
&= \mathbf{x}_0 + \mathbf{P}_{i-1}\mathbf{z}_{i-1} + \zeta_i\mathbf{p}_i\\
&= \mathbf{x}_{i-1} + \zeta_i\mathbf{p}_i.
\end{aligned}$$
The relations above straightforwardly lead to the direct Lanczos method, which turns out to be slightly more complex than the conjugate gradient method derived below.
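A minimal NumPy sketch of the direct Lanczos method assembled from the recurrences above is given below; the function name `direct_lanczos`, the fixed step count, and the dense-matrix setting are illustrative assumptions.

```python
import numpy as np

def direct_lanczos(A, b, x0, m):
    """Direct Lanczos sketch for symmetric positive-definite A: combines the
    Lanczos recurrence with the LU-based updates d_i, c_i, p_i, zeta_i and
    the iterate update x_i = x_{i-1} + zeta_i p_i."""
    x = x0.copy()
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    v = r0 / beta                        # v_1
    v_prev = np.zeros_like(v)
    p_prev = np.zeros_like(v)
    b_i = 0.0                            # off-diagonal entry b_i (b_1 unused)
    d_prev = zeta_prev = 0.0
    for i in range(1, m + 1):
        a_i = v @ (A @ v)                # diagonal entry a_i of the tridiagonal H
        if i == 1:
            d_i = a_i                    # d_1 = a_1
            zeta_i = beta                # zeta_1 = ||r_0||_2
        else:
            c_i = b_i / d_prev           # c_i = b_i / d_{i-1}
            d_i = a_i - c_i * b_i        # d_i = a_i - c_i b_i
            zeta_i = -c_i * zeta_prev    # zeta_i = -c_i zeta_{i-1}
        p_i = (v - b_i * p_prev) / d_i   # p_i = (v_i - b_i p_{i-1}) / d_i
        x = x + zeta_i * p_i             # x_i = x_{i-1} + zeta_i p_i
        # Lanczos three-term recurrence for the next basis vector
        w = A @ v - a_i * v - b_i * v_prev
        b_next = np.linalg.norm(w)       # b_{i+1}
        if b_next == 0.0:
            break
        v_prev, v = v, w / b_next
        b_i, d_prev, p_prev, zeta_prev = b_next, d_i, p_i, zeta_i
    return x
```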
The conjugate gradient method from imposing orthogonality and conjugacy
If we allow $\mathbf{p}_i$ to scale and compensate for the scaling in the constant factor, we potentially can have simpler recurrences of the form:

$$\begin{aligned}
\mathbf{x}_i &= \mathbf{x}_{i-1} + \alpha_{i-1}\mathbf{p}_{i-1},\\
\mathbf{r}_i &= \mathbf{r}_{i-1} - \alpha_{i-1}\mathbf{A}\mathbf{p}_{i-1},\\
\mathbf{p}_i &= \mathbf{r}_i + \beta_{i-1}\mathbf{p}_{i-1}.
\end{aligned}$$

As premises for the simplification, we now derive the orthogonality of $\mathbf{r}_i$ and conjugacy of $\mathbf{p}_i$, i.e., for $i \neq j$,

$$\begin{aligned}
\mathbf{r}_i^\mathsf{T}\mathbf{r}_j &= 0,\\
\mathbf{p}_i^\mathsf{T}\mathbf{A}\mathbf{p}_j &= 0.
\end{aligned}$$
The residuals are mutually orthogonal because $\mathbf{r}_i$ is essentially a multiple of $\mathbf{v}_{i+1}$, since for $i = 0$, $\mathbf{r}_0 = \lVert\mathbf{r}_0\rVert_2\mathbf{v}_1$, and for $i > 0$,

$$\begin{aligned}
\mathbf{r}_i &= \mathbf{b} - \mathbf{A}\mathbf{x}_i\\
&= \mathbf{b} - \mathbf{A}\left(\mathbf{x}_0 + \mathbf{V}_i\mathbf{y}_i\right)\\
&= \mathbf{r}_0 - \mathbf{A}\mathbf{V}_i\mathbf{y}_i\\
&= \mathbf{r}_0 - \mathbf{V}_{i+1}\tilde{\mathbf{H}}_i\mathbf{y}_i\\
&= \mathbf{r}_0 - \mathbf{V}_i\mathbf{H}_i\mathbf{y}_i - h_{i+1,i}\left(\mathbf{e}_i^\mathsf{T}\mathbf{y}_i\right)\mathbf{v}_{i+1}\\
&= \lVert\mathbf{r}_0\rVert_2\mathbf{v}_1 - \mathbf{V}_i\left(\lVert\mathbf{r}_0\rVert_2\mathbf{e}_1\right) - h_{i+1,i}\left(\mathbf{e}_i^\mathsf{T}\mathbf{y}_i\right)\mathbf{v}_{i+1}\\
&= -h_{i+1,i}\left(\mathbf{e}_i^\mathsf{T}\mathbf{y}_i\right)\mathbf{v}_{i+1}.
\end{aligned}$$
To see the conjugacy of $\mathbf{p}_i$, it suffices to show that $\mathbf{P}_i^\mathsf{T}\mathbf{A}\mathbf{P}_i$ is diagonal:

$$\begin{aligned}
\mathbf{P}_i^\mathsf{T}\mathbf{A}\mathbf{P}_i &= \mathbf{U}_i^{-\mathsf{T}}\mathbf{V}_i^\mathsf{T}\mathbf{A}\mathbf{V}_i\mathbf{U}_i^{-1}\\
&= \mathbf{U}_i^{-\mathsf{T}}\mathbf{H}_i\mathbf{U}_i^{-1}\\
&= \mathbf{U}_i^{-\mathsf{T}}\mathbf{L}_i\mathbf{U}_i\mathbf{U}_i^{-1}\\
&= \mathbf{U}_i^{-\mathsf{T}}\mathbf{L}_i
\end{aligned}$$

is symmetric and lower triangular simultaneously and thus must be diagonal.
Now we can derive the constant factors $\alpha_i$ and $\beta_i$ with respect to the scaled $\mathbf{p}_i$ by solely imposing the orthogonality of $\mathbf{r}_i$ and conjugacy of $\mathbf{p}_i$.
Due to the orthogonality of $\mathbf{r}_i$, it is necessary that $\mathbf{r}_{i+1}^\mathsf{T}\mathbf{r}_i = 0$. As a result,

$$\begin{aligned}
\alpha_i &= \frac{\mathbf{r}_i^\mathsf{T}\mathbf{r}_i}{\mathbf{r}_i^\mathsf{T}\mathbf{A}\mathbf{p}_i}\\
&= \frac{\mathbf{r}_i^\mathsf{T}\mathbf{r}_i}{\left(\mathbf{p}_i - \beta_{i-1}\mathbf{p}_{i-1}\right)^\mathsf{T}\mathbf{A}\mathbf{p}_i}\\
&= \frac{\mathbf{r}_i^\mathsf{T}\mathbf{r}_i}{\mathbf{p}_i^\mathsf{T}\mathbf{A}\mathbf{p}_i}.
\end{aligned}$$
Similarly, due to the conjugacy of $\mathbf{p}_i$, it is necessary that $\mathbf{p}_{i+1}^\mathsf{T}\mathbf{A}\mathbf{p}_i = 0$. As a result,

$$\begin{aligned}
\beta_i &= -\frac{\mathbf{r}_{i+1}^\mathsf{T}\mathbf{A}\mathbf{p}_i}{\mathbf{p}_i^\mathsf{T}\mathbf{A}\mathbf{p}_i}\\
&= -\frac{\mathbf{r}_{i+1}^\mathsf{T}\left(\mathbf{r}_i - \mathbf{r}_{i+1}\right)}{\alpha_i\,\mathbf{p}_i^\mathsf{T}\mathbf{A}\mathbf{p}_i}\\
&= \frac{\mathbf{r}_{i+1}^\mathsf{T}\mathbf{r}_{i+1}}{\mathbf{r}_i^\mathsf{T}\mathbf{r}_i}.
\end{aligned}$$
This completes the derivation.
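For illustration, a minimal NumPy sketch of the resulting conjugate gradient iteration, using the formulas for $\alpha_i$ and $\beta_i$ just derived, is given below; the function name, tolerance, and stopping rule are illustrative choices rather than part of the derivation.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=1000):
    """Conjugate gradient method for symmetric positive-definite A,
    with the alpha/beta formulas derived above."""
    x = x0.copy()
    r = b - A @ x0
    p = r.copy()                    # p_0 = r_0
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)       # alpha_i = r_i^T r_i / p_i^T A p_i
        x = x + alpha * p           # x_{i+1} = x_i + alpha_i p_i
        r = r - alpha * Ap          # r_{i+1} = r_i - alpha_i A p_i
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:   # stop when the residual norm is small
            break
        beta = rs_new / rs          # beta_i = r_{i+1}^T r_{i+1} / r_i^T r_i
        p = r + beta * p            # p_{i+1} = r_{i+1} + beta_i p_i
        rs = rs_new
    return x
```

In exact arithmetic this iteration produces the same iterates as the direct Lanczos sketch above, while storing only a handful of vectors per step.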