Matrix exponential
In mathematics, the matrix exponential is a matrix function on square matrices analogous to the ordinary exponential function. Abstractly, the matrix exponential gives the connection between a matrix Lie algebra and the corresponding Lie group.
Let X be an n×n real or complex matrix. The exponential of X, denoted by e^{X} or exp(X), is the n×n matrix given by the power series
    e^{X} = \sum_{k=0}^{\infty} \frac{1}{k!} X^{k}.
The above series always converges, so the exponential of X is well-defined. Note that if X is a 1×1 matrix, the matrix exponential of X is a 1×1 matrix consisting of the ordinary exponential of the single element of X.
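The defining power series can be turned into a small numerical sketch. The helper names below (identity, matmul, expm_series) and the cutoff of 40 terms are assumptions made for illustration; truncating the series is only reasonable for matrices of modest norm and is not how production libraries compute the exponential.

```python
import math


def identity(n):
    """Return the n×n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_series(x, terms=40):
    """Approximate e^X by the partial sum of X^k / k! for k < terms."""
    n = len(x)
    result = identity(n)
    term = identity(n)  # holds X^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[value / k for value in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# A 1×1 matrix reduces to the ordinary exponential of its single entry.
one_by_one = expm_series([[1.0]])
print(one_by_one[0][0], math.e)
```

For a 1×1 input the result agrees with the scalar exponential up to rounding, matching the remark above.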
Properties
Let X and Y be n×n complex matrices and let a and b be arbitrary complex numbers. We denote the n×n identity matrix by I and the zero matrix by 0. The matrix exponential satisfies the following properties:
 e^{0} = I.
 e^{aX} e^{bX} = e^{(a + b)X}.
 e^{X} e^{−X} = I.
 If XY = YX then e^{X} e^{Y} = e^{Y} e^{X} = e^{X + Y}.
 If Y is invertible then e^{YXY^{−1}} = Y e^{X} Y^{−1}.
 exp(X^{T}) = (exp X)^{T}, where X^{T} denotes the transpose of X. It follows that if X is symmetric then e^{X} is also symmetric, and that if X is skew-symmetric then e^{X} is orthogonal.
 exp(X*) = (exp X)*, where X* denotes the conjugate transpose of X. It follows that if X is Hermitian then e^{X} is also Hermitian, and that if X is skew-Hermitian then e^{X} is unitary.
Linear differential equation systems
Main article: matrix differential equation
One of the reasons for the importance of the matrix exponential is that it can be used to solve systems of linear ordinary differential equations. The solution of
    \frac{d}{dt} y(t) = A y(t), \quad y(0) = y_0,
where A is a constant matrix, is given by
    y(t) = e^{At} y_0.
The matrix exponential can also be used to solve the inhomogeneous equation
    \frac{d}{dt} y(t) = A y(t) + z(t), \quad y(0) = y_0.
See the section on applications below for examples.
There is no closed-form solution for differential equations of the form
    \frac{d}{dt} y(t) = A(t) y(t), \quad y(0) = y_0,
where A is not constant, but the Magnus series gives the solution as an infinite sum.
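As a rough numerical illustration of the constant-coefficient case, the sketch below evaluates y(t) = e^{tA} y_0 for a 2×2 rotation generator, whose exact solution is known in terms of cosine and sine. The function names, the example matrix and the truncated-series exponential are all assumptions made for this sketch.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_series(x, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[value / k for value in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def solve_linear_ode(a, y0, t):
    """Return y(t) = e^{tA} y0 for the system y' = A y, y(0) = y0."""
    scaled = [[t * value for value in row] for row in a]
    propagator = expm_series(scaled)
    return [sum(propagator[i][j] * y0[j] for j in range(len(y0)))
            for i in range(len(y0))]


# Example system: y1' = y2, y2' = -y1, with y(0) = (1, 0).
A = [[0.0, 1.0], [-1.0, 0.0]]
y_at_1 = solve_linear_ode(A, [1.0, 0.0], 1.0)
# The exact solution of this particular system is (cos t, -sin t).
print(y_at_1, (math.cos(1.0), -math.sin(1.0)))
```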
The exponential of sums
We know that the exponential function satisfies e^{x + y} = e^{x} e^{y} for any real numbers (scalars) x and y. The same goes for commuting matrices: if the matrices X and Y commute (meaning that XY = YX), then
    e^{X + Y} = e^{X} e^{Y}.
However, if they do not commute, then the above equality does not necessarily hold. In that case the Baker–Campbell–Hausdorff formula expresses the product e^{X} e^{Y} as the exponential of a single matrix.
The converse is false: the equation e^{X + Y} = e^{X} e^{Y} does not necessarily imply that X and Y commute. However, the converse is true if X and Y contain only algebraic numbers and their size is at least 2×2.
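The failure of the sum rule for non-commuting matrices can be seen numerically. In the sketch below, the two nilpotent matrices and the commuting pair are illustrative choices, and sum_rule_gap is a hypothetical helper that measures the largest entrywise difference between e^{X + Y} and e^{X} e^{Y}.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_series(x, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[value / k for value in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def sum_rule_gap(x, y):
    """Largest entrywise difference between e^(X+Y) and e^X e^Y."""
    n = len(x)
    total = [[x[i][j] + y[i][j] for j in range(n)] for i in range(n)]
    lhs = expm_series(total)
    rhs = matmul(expm_series(x), expm_series(y))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(n) for j in range(n))


# These two matrices do not commute, so the gap is clearly nonzero.
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
print(sum_rule_gap(X, Y))

# Both matrices below have the form aI + bN, so they commute.
X_COMMUTING = [[0.1, 0.2], [0.0, 0.1]]
Y_COMMUTING = [[0.3, 0.4], [0.0, 0.3]]
print(sum_rule_gap(X_COMMUTING, Y_COMMUTING))
```

For the first pair, e^{X} e^{Y} has top-left entry 2 while e^{X + Y} has top-left entry cosh(1), so the two sides differ visibly; for the commuting pair the gap is at the level of rounding error.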
The exponential map
Note that the exponential of a matrix is always an invertible matrix. The inverse matrix of e^{X} is given by e^{−X}. This is analogous to the fact that the exponential of a complex number is always nonzero. The matrix exponential then gives us a map
    \exp \colon M_{n}(\mathbb{C}) \to \mathrm{GL}(n, \mathbb{C})
from the space of all n×n matrices to the general linear group of degree n, i.e. the group of all n×n invertible matrices. In fact, this map is surjective, which means that every invertible matrix can be written as the exponential of some other matrix (for this, it is essential to consider the field C of complex numbers and not R).
For any two matrices X and Y, we have
    \| e^{X + Y} - e^{X} \| \le \|Y\| e^{\|X\|} e^{\|Y\|},
where ‖·‖ denotes an arbitrary matrix norm. It follows that the exponential map is continuous and Lipschitz continuous on compact subsets of M_{n}(C).
The map
    t \mapsto e^{tX}, \quad t \in \mathbb{R}
defines a smooth curve in the general linear group which passes through the identity element at t = 0. In fact, this gives a one-parameter subgroup of the general linear group since
    e^{tX} e^{sX} = e^{(t + s)X}.
The derivative of this curve (or tangent vector) at a point t is given by
    \frac{d}{dt} e^{tX} = X e^{tX} = e^{tX} X.
The derivative at t = 0 is just the matrix X, which is to say that X generates this one-parameter subgroup.
More generally,
    \frac{d}{dt} e^{X(t)} = \int_{0}^{1} e^{\alpha X(t)} \frac{dX(t)}{dt} e^{(1 - \alpha) X(t)} \, d\alpha.
Taking e^{X(t)} in the above expression outside the integral sign and expanding the integrand with the help of the Hadamard lemma, one can obtain the following useful expression for the derivative of the matrix exponential:
    \left( \frac{d}{dt} e^{X(t)} \right) e^{-X(t)} = \frac{dX(t)}{dt} + \frac{1}{2!} \left[ X(t), \frac{dX(t)}{dt} \right] + \frac{1}{3!} \left[ X(t), \left[ X(t), \frac{dX(t)}{dt} \right] \right] + \cdots
The determinant of the matrix exponential
It can be shown that for any complex square matrix X, the following identity holds:
    \det(e^{X}) = e^{\operatorname{tr}(X)}.
In addition to providing a computational tool, this formula shows that a matrix exponential is always an invertible matrix. This follows from the fact that the right-hand side of the above equation is always nonzero, and so det(e^{X}) ≠ 0, which means that e^{X} must be invertible. Another observation is the following: in the real-valued case, we see that the map
    \exp \colon M_{n}(\mathbb{R}) \to \mathrm{GL}(n, \mathbb{R})
is not surjective (this is in contrast with the complex case mentioned earlier). This follows from the fact that (for real-valued matrices) the right-hand side of the above equation is always positive while there exist invertible matrices with a negative determinant.
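The determinant identity det(e^{X}) = e^{tr(X)} can be spot-checked numerically. The sketch below does so for one arbitrarily chosen real 2×2 matrix; the helper names and the truncated-series exponential are assumptions for illustration only.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_series(x, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[value / k for value in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def det2(m):
    """Determinant of a 2×2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


X = [[0.5, 1.0], [-0.3, 0.2]]  # arbitrary example matrix
lhs = det2(expm_series(X))     # det(e^X)
rhs = math.exp(trace(X))       # e^{tr X}
print(lhs, rhs)
```

Because X is real, both sides are positive, in line with the remark that the real exponential map cannot reach matrices of negative determinant.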
Computing the matrix exponential
Finding reliable and accurate methods to compute the matrix exponential is difficult, and this is still a topic of considerable current research in mathematics and numerical analysis. Both MATLAB and GNU Octave use Padé approximants. Several methods are listed below.
Diagonalizable case
If a matrix is diagonal:
    A = \operatorname{diag}(a_1, a_2, \ldots, a_n),
then its exponential can be obtained by just exponentiating every entry on the main diagonal:
    e^{A} = \operatorname{diag}(e^{a_1}, e^{a_2}, \ldots, e^{a_n}).
This also allows one to exponentiate diagonalizable matrices. If A = UDU^{−1} and D is diagonal, then e^{A} = U e^{D} U^{−1}. Application of Sylvester's formula yields the same result.
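A minimal sketch of the diagonalizable case follows. The matrix A = [[2, −1], [−1, 2]], with eigenvalues 1 and 3 and the hand-computed U and U^{−1} shown in the code, is an example chosen for illustration; exp_diagonalizable is a hypothetical helper that expects the caller to supply the decomposition.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def exp_diagonalizable(u, eigenvalues, u_inv):
    """Return U e^D U^{-1}, where D = diag(eigenvalues)."""
    n = len(eigenvalues)
    exp_d = [[math.exp(eigenvalues[i]) if i == j else 0.0 for j in range(n)]
             for i in range(n)]
    return matmul(matmul(u, exp_d), u_inv)


# A = U D U^{-1} with D = diag(1, 3); here A = [[2, -1], [-1, 2]].
U = [[1.0, 1.0], [1.0, -1.0]]
U_INV = [[0.5, 0.5], [0.5, -0.5]]
EXP_A = exp_diagonalizable(U, [1.0, 3.0], U_INV)

# Closed form for this example: (1/2) [[e + e^3, e - e^3], [e - e^3, e + e^3]].
e1, e3 = math.exp(1.0), math.exp(3.0)
EXPECTED = [[(e1 + e3) / 2, (e1 - e3) / 2], [(e1 - e3) / 2, (e1 + e3) / 2]]
print(EXP_A)
```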
Nilpotent case
A matrix N is nilpotent if N^{q} = 0 for some integer q. In this case, the matrix exponential e^{N} can be computed directly from the series expansion, as the series terminates after a finite number of terms:
    e^{N} = I + N + \frac{1}{2} N^{2} + \frac{1}{6} N^{3} + \cdots + \frac{1}{(q - 1)!} N^{q - 1}.
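Because the series terminates, the nilpotent case can be computed exactly. The sketch below uses rational arithmetic from the standard library; exp_nilpotent is a hypothetical helper and the 3×3 strictly upper triangular matrix is an illustrative choice.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def exp_nilpotent(n_mat):
    """Return e^N exactly for a nilpotent matrix N with rational entries."""
    size = len(n_mat)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]  # holds N^k / k!, starting with k = 0
    # A nilpotent n×n matrix satisfies N^n = 0, so n steps always suffice.
    for k in range(1, size + 1):
        term = [[value / k for value in row] for row in matmul(term, n_mat)]
        if all(value == 0 for row in term for value in row):
            return result
        result = [[result[i][j] + term[i][j] for j in range(size)]
                  for i in range(size)]
    raise ValueError("matrix is not nilpotent")


# Strictly upper triangular, hence nilpotent: here N^3 = 0.
N = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
EXP_N = exp_nilpotent(N)  # equals I + N + N^2 / 2
print(EXP_N)
```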
Generalization
When the minimal polynomial of a matrix X can be factored into a product of first degree polynomials, it can be expressed as a sum
    X = A + N
where
 A is diagonalizable
 N is nilpotent
 A commutes with N (i.e. AN = NA)
This is the Jordan–Chevalley decomposition.
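This means that we can compute the exponential of X by reducing to the previous two cases:
    e^{X} = e^{A + N} = e^{A} e^{N}.
Note that we need the commutativity of A and N for the last step to work.
Another (closely related) method if the field is algebraically closed is to work with the Jordan form of X. Suppose that X = PJP^{−1} where J is the Jordan form of X. Then
    e^{X} = P e^{J} P^{-1}.
Also, since
    J = J_{a_1}(\lambda_1) \oplus J_{a_2}(\lambda_2) \oplus \cdots \oplus J_{a_n}(\lambda_n),
we have
    e^{J} = \exp\big( J_{a_1}(\lambda_1) \big) \oplus \exp\big( J_{a_2}(\lambda_2) \big) \oplus \cdots \oplus \exp\big( J_{a_n}(\lambda_n) \big).
Therefore, we need only know how to compute the matrix exponential of a Jordan block. But each Jordan block is of the form
    J_{a}(\lambda) = \lambda I + N,
where N is a special nilpotent matrix. The matrix exponential of this block is given by
    e^{\lambda I + N} = e^{\lambda} e^{N}.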
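For a single Jordan block the recipe above has an explicit form: since N has ones on the superdiagonal, N^k / k! places 1/k! on the k-th superdiagonal, so the (i, j) entry of the exponential is e^{λ} / (j − i)! for j ≥ i and 0 otherwise. The sketch below builds that matrix directly; exp_jordan_block is a hypothetical helper name.

```python
import math


def exp_jordan_block(lam, size):
    """Return exp(lam * I + N) for the size×size Jordan block with eigenvalue lam."""
    scale = math.exp(lam)
    return [[scale / math.factorial(j - i) if j >= i else 0.0
             for j in range(size)]
            for i in range(size)]


# With eigenvalue 0 the block is N itself, giving I + N + N^2 / 2.
print(exp_jordan_block(0.0, 3))
```

Combined with the direct-sum rule, this reduces the exponential of any Jordan form to a block-by-block computation.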
Alternative
If P and Q_{t} are nonzero polynomials in one variable, such that P(A) = 0, and if the meromorphic function
    f(z) = \frac{e^{tz} - Q_{t}(z)}{P(z)}
is entire, then
    e^{tA} = Q_{t}(A).
To prove this, multiply the first of the two above equalities by P(z) and replace z by A.
Such a polynomial Q_{t} can be found as follows. Let a be a root of P, and Q_{a,t} the product of P by the principal part of the Laurent series of e^{tz}/P(z) at a. Then the sum S_{t} of the Q_{a,t}, where a runs over all the roots of P, can be taken as a particular Q_{t}. All the other Q_{t} will be obtained by adding a multiple of P to S_{t}. In particular S_{t} is the only Q_{t} whose degree is less than that of P.
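Consider the case of a 2-by-2 matrix
    A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
The exponential matrix e^{tA} is of the form e^{tA} = s_0(t) + s_1(t) A. (For any complex number z and any C-algebra B we denote again by z the product of z by the unit of B.) Let α and β be the roots of the characteristic polynomial
    P(z) = z^{2} - (a + d) z + ad - bc.
Then we have
    s_0(t) = \frac{\alpha e^{\beta t} - \beta e^{\alpha t}}{\alpha - \beta}, \quad s_1(t) = \frac{e^{\alpha t} - e^{\beta t}}{\alpha - \beta}
if α ≠ β, and
    s_0(t) = (1 - \alpha t) e^{\alpha t}, \quad s_1(t) = t e^{\alpha t}
if α = β.
In either case, writing
    s = \frac{\alpha + \beta}{2} = \frac{\operatorname{tr} A}{2}
and
    q = \frac{\alpha - \beta}{2} = \pm \sqrt{-\det(A - sI)},
we have
    s_0(t) = e^{st} \left( \cosh(qt) - s \frac{\sinh(qt)}{q} \right), \quad s_1(t) = e^{st} \frac{\sinh(qt)}{q},
where \frac{\sinh(qt)}{q} is 0 if t = 0, and t if q = 0.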
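The 2-by-2 case lends itself to a short closed-form routine. The sketch below assumes the representation e^{tA} = s_0(t) I + s_1(t) A with s = tr(A)/2, q = sqrt(−det(A − sI)), s_1(t) = e^{st} sinh(qt)/q and s_0(t) = e^{st} (cosh(qt) − s sinh(qt)/q), reading sinh(qt)/q as t when q = 0. The function name expm_2x2 and the threshold used to detect q = 0 are illustrative assumptions.

```python
import cmath


def expm_2x2(a, b, c, d, t=1.0):
    """Return e^{tA} for A = [[a, b], [c, d]] as a matrix of complex numbers."""
    s = (a + d) / 2
    # q^2 = -det(A - sI); either sign of the square root gives the same result.
    q = cmath.sqrt(-((a - s) * (d - s) - b * c))
    if abs(q) < 1e-12:
        ratio = t  # limit of sinh(qt)/q as q tends to 0 (repeated eigenvalue)
    else:
        ratio = cmath.sinh(q * t) / q
    s0 = cmath.exp(s * t) * (cmath.cosh(q * t) - s * ratio)
    s1 = cmath.exp(s * t) * ratio
    return [[s0 + s1 * a, s1 * b], [s1 * c, s0 + s1 * d]]


# Distinct eigenvalues ±i: the result is a rotation matrix.
ROTATION = expm_2x2(0.0, 1.0, -1.0, 0.0)

# Repeated eigenvalue 2 (a Jordan block): the result is e^2 [[1, 1], [0, 1]].
JORDAN = expm_2x2(2.0, 1.0, 0.0, 2.0)
print(ROTATION, JORDAN)
```

Values of q that are tiny but nonzero fall back to the limiting branch here; a more careful implementation would treat that regime with a series expansion.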
The polynomial S_{t} can also be given the following "interpolation" characterization. Put e_{t}(z) = e^{tz} and n = deg P. Then S_{t} is the unique polynomial of degree less than n which satisfies S_{t}^{(k)}(a) = e_{t}^{(k)}(a) whenever k is less than the multiplicity of a as a root of P.
We assume (as we obviously can) that P is the minimal polynomial of A.
We also assume that A is a diagonalizable matrix. In particular, the roots of P are simple, and the "interpolation" characterization tells us that S_{t} is given by the Lagrange interpolation formula.
At the other extreme, if P = (z − a)^{n}, then
    S_{t} = e^{at} \sum_{k=0}^{n-1} \frac{t^{k}}{k!} (z - a)^{k}.
The simplest case not covered by the above observations is when P = (z − a)^{2} (z − b) with a ≠ b, which gives
    S_{t} = e^{at} \frac{z - b}{a - b} \left( 1 + \left( t + \frac{1}{b - a} \right) (z - a) \right) + e^{bt} \frac{(z - a)^{2}}{(b - a)^{2}}.
via Laplace Transform
As above we know that the solution to the system of linear differential equations given by \frac{d}{dt} y(t) = A y(t), y(0) = y_0 is y(t) = e^{At} y_0. Using the Laplace transform, letting Y(s) = \mathcal{L}\{y\}, and applying to the differential equation we get
    s Y(s) - y_0 = A Y(s) \quad \Rightarrow \quad (sI - A) Y(s) = y_0,
where I is the identity matrix. Therefore y(t) = \mathcal{L}^{-1}\{(sI - A)^{-1}\} y_0. Thus it can be concluded that e^{At} = \mathcal{L}^{-1}\{(sI - A)^{-1}\}. And from this we can find e^{A} by setting t = 1.
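As a small worked illustration of the Laplace transform route, take the rotation generator below (an example chosen here, not one from the text above). Inverting sI − A and transforming back entry by entry, using the standard transform pairs for cosine and sine, gives:

```latex
A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\qquad
(sI - A)^{-1}
  = \begin{pmatrix} s & -1 \\ 1 & s \end{pmatrix}^{-1}
  = \frac{1}{s^{2} + 1} \begin{pmatrix} s & 1 \\ -1 & s \end{pmatrix},
\qquad
e^{At} = \mathcal{L}^{-1}\{ (sI - A)^{-1} \}
  = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}.
```

Setting t = 1 then gives e^{A} as the rotation matrix through one radian.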
Calculations
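Suppose that we want to compute the exponential of a 3×3 matrix B.
Its Jordan form is
    J = P^{-1} B P = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 16 & 1 \\ 0 & 0 & 16 \end{pmatrix} = J_{1}(4) \oplus J_{2}(16),
where P is an invertible matrix whose columns form a Jordan basis of B.
Let us first calculate exp(J). We have
    \exp(J) = \exp\big( J_{1}(4) \oplus J_{2}(16) \big) = \exp\big( J_{1}(4) \big) \oplus \exp\big( J_{2}(16) \big).
The exponential of a 1×1 matrix is just the exponential of the one entry of the matrix, so exp(J_{1}(4)) = [e^{4}]. The exponential of J_{2}(16) can be calculated by the formula exp(λI + N) = e^{λ} exp(N) mentioned above; this yields
    \exp\big( J_{2}(16) \big) = e^{16} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} e^{16} & e^{16} \\ 0 & e^{16} \end{pmatrix}.
Therefore, the exponential of the original matrix B is
    \exp(B) = P \exp(J) P^{-1} = P \begin{pmatrix} e^{4} & 0 & 0 \\ 0 & e^{16} & e^{16} \\ 0 & 0 & e^{16} \end{pmatrix} P^{-1}.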
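The calculation can be traced in code for one concrete matrix with the Jordan form J_{1}(4) ⊕ J_{2}(16). The matrices B, P and P_INV below are assumptions chosen for illustration (they are not recoverable from the text above); the sketch checks exactly, with rational arithmetic, that BP = PJ before forming P exp(J) P^{−1} in floating point.

```python
import math
from fractions import Fraction as F


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def to_float(m):
    """Convert a matrix of exact numbers to floats."""
    return [[float(value) for value in row] for row in m]


# Hypothetical example data: a matrix B with Jordan form J_1(4) + J_2(16).
B = [[21, 17, 6], [-5, -1, -6], [4, 4, 16]]
P = [[F(-1, 4), 2, F(5, 4)], [F(1, 4), -2, F(-1, 4)], [0, 4, 0]]
P_INV = [[1, 5, 2], [0, 0, F(1, 4)], [1, 1, 0]]
J = [[4, 0, 0], [0, 16, 1], [0, 0, 16]]

# Exact consistency checks: B P = P J and P P^{-1} = I.
assert matmul(B, P) == matmul(P, J)
assert matmul(P, P_INV) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# exp(J) is assembled block by block, as in the text.
e4, e16 = math.exp(4), math.exp(16)
EXP_J = [[e4, 0.0, 0.0], [0.0, e16, e16], [0.0, 0.0, e16]]

# exp(B) = P exp(J) P^{-1}.
EXP_B = matmul(matmul(to_float(P), EXP_J), to_float(P_INV))
print(EXP_B)
```

For this choice of B the bottom-right entry of exp(B) works out to e^{16}, which gives a quick sanity check on the result.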
Linear differential equations
The matrix exponential has applications to systems of linear differential equations. (See also matrix differential equation.) Recall from earlier in this article that a differential equation of the form
    y' = Cy
has solution e^{Ct} y(0). If we consider the vector
    y(t) = \begin{pmatrix} y_1(t) \\ \vdots \\ y_n(t) \end{pmatrix}
we can express a system of coupled linear differential equations as
    y'(t) = A y(t) + b(t).
If we make an ansatz and use an integrating factor of e^{−At} and multiply throughout, we obtain
    e^{-At} y' - e^{-At} A y = e^{-At} b
    e^{-At} y' - A e^{-At} y = e^{-At} b
    \frac{d}{dt} \left( e^{-At} y \right) = e^{-At} b.
The second step is possible due to the fact that if AB = BA then e^{At} B = B e^{At}. If we can calculate e^{At}, then we can obtain the solution to the system.
Example (homogeneous)
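Say we have the homogeneous system
    y'(t) = A y(t),
where A is the associated matrix of coefficients. Once the matrix exponential e^{tA} has been computed, the general solution of the system is
    y(t) = e^{tA} C,
where C is a vector of arbitrary constants; that is, C = y(0).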
Inhomogeneous case – variation of parameters
For the inhomogeneous case, we can use integrating factors (a method akin to variation of parameters). We seek a particular solution of the form y_{p}(t) = exp(tA) z(t):
    y_{p}'(t) = (e^{tA})' z(t) + e^{tA} z'(t) = A e^{tA} z(t) + e^{tA} z'(t) = A y_{p}(t) + e^{tA} z'(t).
For y_{p} to be a solution:
    e^{tA} z'(t) = b(t)
    z'(t) = (e^{tA})^{-1} b(t)
    z(t) = \int_{0}^{t} e^{-uA} b(u) \, du + c.
So,
    y_{p}(t) = e^{tA} \int_{0}^{t} e^{-uA} b(u) \, du + e^{tA} c = \int_{0}^{t} e^{(t - u)A} b(u) \, du + e^{tA} c,
where c is determined by the initial conditions of the problem.
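More precisely, consider the equation
    Y' - A Y = F(t)
with the initial condition Y(t_0) = Y_0, where
 A is an n by n complex matrix,
 F is a continuous function from some open interval I to C^{n},
 t_0 is a point of I, and
 Y_0 is a vector of C^{n}.
Left multiplying the above displayed equality by e^{−tA} and integrating from t_0 to t, we get
    Y(t) = e^{(t - t_0)A} Y_0 + \int_{t_0}^{t} e^{(t - x)A} F(x) \, dx.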
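We claim that the solution to the equation
    P(d/dt) \, y = f(t)
with the initial conditions y^{(k)}(t_0) = y_k for 0 ≤ k < n is
    y(t) = \sum_{k=0}^{n-1} y_k \, s_k(t - t_0) + \int_{t_0}^{t} s_{n-1}(t - x) f(x) \, dx,
where the notation is as follows:
 P is a monic polynomial of degree n > 0,
 f is a continuous complex valued function defined on some open interval I,
 t_0 is a point of I,
 y_k is a complex number, and
 s_k(t) is the coefficient of X^{k} in the polynomial denoted by S_{t} in Subsection Alternative above.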
To justify this claim, we transform our order n scalar equation into an order one vector equation by the usual reduction to a first order system. Our vector equation takes the form
    \frac{dY}{dt} - A Y = F(t), \quad Y(t_0) = Y_0,
where A is the transpose companion matrix of P. We solve this equation as explained above, computing the matrix exponentials by the observation made in Subsection Alternative above.
In the case n = 2 we get the following statement. The solution to
    y'' - (\alpha + \beta) y' + \alpha \beta \, y = f(t), \quad y(t_0) = y_0, \quad y'(t_0) = y_1
is
    y(t) = y_0 \, s_0(t - t_0) + y_1 \, s_1(t - t_0) + \int_{t_0}^{t} s_1(t - x) f(x) \, dx,
where the functions s_0 and s_1 are as in Subsection Alternative above.
Example (inhomogeneous)
Say we have the inhomogeneous system
    y'(t) = A y(t) + b(t),
where A is the matrix of the homogeneous example and b(t) is the forcing term.
From before, we have the general solution to the homogeneous equation. Since the sum of the homogeneous and particular solutions gives the general solution to the inhomogeneous problem, we now only need to find the particular solution (via variation of parameters).
We have, above:
    y_{p}(t) = e^{tA} \int_{0}^{t} e^{-uA} b(u) \, du + e^{tA} c,
which can be further simplified to get the requisite particular solution determined through variation of parameters.
See also
 Matrix function
 Matrix logarithm
 Exponential function
 Exponential map
 Vector flow
 Golden–Thompson inequality
 Phase-type distribution
 Lie product formula
 Baker–Campbell–Hausdorff formula