# Matrix exponential

In mathematics, the matrix exponential is a matrix function on square matrices analogous to the ordinary exponential function. Abstractly, the matrix exponential gives the connection between a matrix Lie algebra and the corresponding Lie group.

Let X be an n×n real or complex matrix. The exponential of X, denoted by $e^{X}$ or exp(X), is the n×n matrix given by the power series

$e^{X}=\sum _{k=0}^{\infty }{1 \over k!}X^{k}.$

The above series always converges, so the exponential of X is well-defined. If X is a 1×1 matrix, the matrix exponential of X is a 1×1 matrix whose single element is the ordinary exponential of the single element of X.
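As a rough numerical illustration of the definition, the sketch below sums the first few dozen terms of the power series in plain Python. The names `mat_mul` and `expm_series` and the cutoff `terms=40` are illustrative choices, not part of the definition. Truncating the series is adequate only for small matrices of modest norm.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(X, terms=40):
    """Approximate e^X by the partial sum of X^k / k! for k < terms."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0 term: I
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term holds X^(k-1)/(k-1)!; turn it into X^k/k!
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

# 1x1 case: reduces to the ordinary exponential of the single entry.
one_by_one = expm_series([[2.0]])
# Nilpotent 2x2 case: the series stops after the linear term.
shear = expm_series([[0.0, 1.0], [0.0, 0.0]])
```

Here `one_by_one[0][0]` agrees with `math.exp(2.0)` to rounding error, and `shear` is the matrix with rows (1, 1) and (0, 1).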

## Properties

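Let X and Y be n×n complex matrices and let a and b be arbitrary complex numbers. We denote the n×n identity matrix by I and the zero matrix by 0. The matrix exponential satisfies the following properties:

• $e^{0}=I$
• $e^{aX}e^{bX}=e^{(a+b)X}$
• $e^{X}e^{-X}=I$
• If XY = YX, then $e^{X}e^{Y}=e^{Y}e^{X}=e^{X+Y}$.
• If Y is invertible, then $e^{YXY^{-1}}=Ye^{X}Y^{-1}$.
• $\exp(X^{\mathsf {T}})=(\exp X)^{\mathsf {T}}$, where $X^{\mathsf {T}}$ denotes the transpose of X.
• $\exp(X^{*})=(\exp X)^{*}$, where $X^{*}$ denotes the conjugate transpose of X.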

### Linear differential equation systems


One of the reasons for the importance of the matrix exponential is that it can be used to solve systems of linear ordinary differential equations. The solution of

${\frac {d}{dt}}y(t)=Ay(t),\quad y(0)=y_{0},$

where A is a constant matrix, is given by

$y(t)=e^{At}y_{0}.$

The matrix exponential can also be used to solve the inhomogeneous equation

${\frac {d}{dt}}y(t)=Ay(t)+z(t),\quad y(0)=y_{0}.$

See the section on applications below for examples.
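A minimal sketch of the homogeneous case follows, assuming a truncated power series is accurate enough for the small matrix used. The helper names (`expm_series`, `solve_linear_ode`) are hypothetical. The test system y1' = y2, y2' = −y1 with y(0) = (1, 0) has the known solution (cos t, −sin t).

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(X, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def solve_linear_ode(A, y0, t):
    """Return y(t) = e^{At} y0 for a constant matrix A."""
    E = expm_series([[a * t for a in row] for row in A])
    return [sum(E[i][j] * y0[j] for j in range(len(y0))) for i in range(len(y0))]

A = [[0.0, 1.0], [-1.0, 0.0]]          # y1' = y2, y2' = -y1
y = solve_linear_ode(A, [1.0, 0.0], t=1.0)
```

The computed `y` can be compared against `(math.cos(1.0), -math.sin(1.0))`.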

There is no closed-form solution for differential equations of the form

${\frac {d}{dt}}y(t)=A(t)\,y(t),\quad y(0)=y_{0},$

where A is not constant, but the Magnus series gives the solution as an infinite sum.

### The exponential of sums

For any real numbers (scalars) x and y, the exponential function satisfies $e^{x+y}=e^{x}e^{y}$. The same is true for commuting matrices: if the matrices X and Y commute (meaning that XY = YX), then

$e^{X+Y}=e^{X}e^{Y}.$

However, for matrices that do not commute, the above equality does not necessarily hold. In this case the Baker–Campbell–Hausdorff formula can be used to calculate $e^{X+Y}$.

The converse is not true in general: the equation $e^{X+Y}=e^{X}e^{Y}$ does not imply that X and Y commute.
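The difference between the commuting and non-commuting cases can be seen numerically. The sketch below uses a truncated power series and hypothetical helper names. The two test pairs are illustrative choices: the first pair commutes because both matrices are polynomials in the same nilpotent matrix, and the second pair does not commute.

```python
def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(X, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def sum_rule_gap(X, Y):
    """Largest absolute entry of e^{X+Y} - e^X e^Y."""
    S = [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    lhs = expm_series(S)
    rhs = mat_mul(expm_series(X), expm_series(Y))
    return max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))

# X = I + 2N and Y = 3I - N with N nilpotent, so XY = YX.
commuting_gap = sum_rule_gap([[1.0, 2.0], [0.0, 1.0]], [[3.0, -1.0], [0.0, 3.0]])
# Strictly upper and strictly lower triangular matrices: XY != YX.
noncommuting_gap = sum_rule_gap([[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]])
```

The first gap is at rounding-error level, while the second is of order one.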

For Hermitian matrices there are two notable theorems related to the trace of matrix exponentials.

#### Golden–Thompson inequality


If A and H are Hermitian matrices, then

$\operatorname {tr} \exp(A+H)\leq \operatorname {tr} (\exp(A)\exp(H)).$

Note that there is no requirement of commutativity. There are counterexamples to show that the Golden–Thompson inequality cannot be extended to three matrices – and, in any event, tr(exp(A)exp(B)exp(C)) is not guaranteed to be real for Hermitian A, B, C. However, the next theorem accomplishes this in a way.

#### Lieb's theorem

Lieb's theorem, named after Elliott H. Lieb, states that, for a fixed Hermitian matrix H, the function

$f(A)=\operatorname {tr} \,\exp \left(H+\log A\right)$

is concave on the cone of positive-definite matrices.

### The exponential map

Note that the exponential of a matrix is always an invertible matrix. The inverse matrix of $e^{X}$ is given by $e^{-X}$. This is analogous to the fact that the exponential of a complex number is always nonzero. The matrix exponential then gives us a map

$\exp \colon M_{n}(\mathbb {C} )\to \mathrm {GL} (n,\mathbb {C} )$

from the space of all n×n matrices to the general linear group of degree n, i.e. the group of all n×n invertible matrices. In fact, this map is surjective, which means that every invertible matrix can be written as the exponential of some other matrix (for this, it is essential to consider the field C of complex numbers and not R).

For any two matrices X and Y,

$\|e^{X+Y}-e^{X}\|\leq \|Y\|e^{\|X\|}e^{\|Y\|},$

where || · || denotes an arbitrary matrix norm. It follows that the exponential map is continuous and Lipschitz continuous on compact subsets of $M_{n}(\mathbb {C} )$.

The map

$t\mapsto e^{tX},\qquad t\in \mathbb {R}$

defines a smooth curve in the general linear group which passes through the identity element at t = 0.

In fact, this gives a one-parameter subgroup of the general linear group since

$e^{tX}e^{sX}=e^{(t+s)X}.$

The derivative of this curve (or tangent vector) at a point t is given by

${\frac {d}{dt}}e^{tX}=Xe^{tX}=e^{tX}X.\qquad (1)$

The derivative at t = 0 is just the matrix X, which is to say that X generates this one-parameter subgroup.

More generally, for a generic t-dependent exponent, X(t),

${\frac {d}{dt}}e^{X(t)}=\int _{0}^{1}e^{\alpha X(t)}{\frac {dX(t)}{dt}}e^{(1-\alpha )X(t)}\,d\alpha .$

Taking the above expression $e^{X(t)}$ outside the integral sign and expanding the integrand with the help of the Hadamard lemma, one can obtain the following useful expression for the derivative of the matrix exponential,

$\left({\frac {d}{dt}}e^{X(t)}\right)e^{-X(t)}={\frac {d}{dt}}X(t)+{\frac {1}{2!}}[X(t),{\frac {d}{dt}}X(t)]+{\frac {1}{3!}}[X(t),[X(t),{\frac {d}{dt}}X(t)]]+\cdots$

### The determinant of the matrix exponential

By Jacobi's formula, for any complex square matrix the following trace identity holds:

$\det \left(e^{A}\right)=e^{\operatorname {tr} (A)}.$

In addition to providing a computational tool, this formula demonstrates that a matrix exponential is always an invertible matrix. This follows from the fact that the right-hand side of the above equation is always non-zero, and so $\det(e^{A})\neq 0$, which means that $e^{A}$ must be invertible.

In the real-valued case, the formula also shows that the map

$\exp \colon M_{n}(\mathbb {R} )\to \mathrm {GL} (n,\mathbb {R} )$

is not surjective, in contrast to the complex case mentioned earlier. This follows from the fact that, for real-valued matrices, the right-hand side of the formula is always positive, while there exist invertible matrices with a negative determinant.
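The trace identity $\det(e^{A})=e^{\operatorname {tr} (A)}$ is easy to spot-check numerically. The following sketch does so for one arbitrarily chosen real 2×2 matrix, using a truncated power series. The helper names and the test matrix are assumptions made for illustration.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(X, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

A = [[1.0, 2.0], [3.0, 4.0]]
E = expm_series(A)
det_of_exp = E[0][0] * E[1][1] - E[0][1] * E[1][0]   # det(e^A)
exp_of_trace = math.exp(A[0][0] + A[1][1])           # e^{tr A}
```

Both quantities are positive and agree up to rounding error, consistent with the observation that a real matrix with negative determinant cannot be the exponential of a real matrix.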

## Computing the matrix exponential

Finding reliable and accurate methods to compute the matrix exponential is difficult, and this is still a topic of considerable current research in mathematics and numerical analysis. Both Matlab and GNU Octave use Padé approximants. Several methods are listed below.

### Diagonalizable case

If a matrix is diagonal:

$A={\begin{bmatrix}a_{1}&0&\ldots &0\\0&a_{2}&\ldots &0\\\vdots &\vdots &\ddots &\vdots \\0&0&\ldots &a_{n}\end{bmatrix}},$

then its exponential can be obtained by exponentiating each entry on the main diagonal:

$e^{A}={\begin{bmatrix}e^{a_{1}}&0&\ldots &0\\0&e^{a_{2}}&\ldots &0\\\vdots &\vdots &\ddots &\vdots \\0&0&\ldots &e^{a_{n}}\end{bmatrix}}.$

This also allows one to exponentiate diagonalizable matrices. If $A=UDU^{-1}$ and D is diagonal, then $e^{A}=Ue^{D}U^{-1}$. Application of Sylvester's formula yields the same result. (To see this, note that addition and multiplication, hence also exponentiation, of diagonal matrices is equivalent to element-wise addition and multiplication, and hence exponentiation; in particular, the "one-dimensional" exponentiation is carried out element-wise in the diagonal case.)

### Projection case

If P is a projection matrix (i.e. is idempotent), its matrix exponential is $e^{P}=I+(e-1)P$. This may be derived by expansion of the definition of the exponential function and by use of the idempotency of P:

$e^{P}=\sum _{k=0}^{\infty }{\frac {P^{k}}{k!}}=I+\left(\sum _{k=1}^{\infty }{\frac {1}{k!}}\right)P=I+(e-1)P.$

### Rotation case

For a simple rotation in which the perpendicular unit vectors a and b specify a plane, the rotation matrix R can be expressed in terms of a similar exponential function involving a generator G and angle θ.

$G=ba^{\mathsf {T}}-ab^{\mathsf {T}},\qquad a^{\mathsf {T}}b=0,$

$-G^{2}=aa^{\mathsf {T}}+bb^{\mathsf {T}}=P,\qquad P^{2}=P,\qquad PG=GP=G,$

${\begin{aligned}R\left(\theta \right)&=e^{G\theta }=I+G\sin(\theta )+G^{2}(1-\cos(\theta ))\\&=I-P+P\cos(\theta )+G\sin(\theta ).\end{aligned}}$

The formula for the exponential results from reducing the powers of G in the series expansion and identifying the respective series coefficients of $G^{2}$ and G with 1 − cos(θ) and sin(θ) respectively. The second expression here for $e^{G\theta }$ is the same as the expression for R(θ) in the article containing the derivation of the generator, $R(\theta )=e^{G\theta }$.

In two dimensions, if $a=(1,0)^{\mathsf {T}}$ and $b=(0,1)^{\mathsf {T}}$, then $G={\begin{pmatrix}0&-1\\1&0\end{pmatrix}}$, $G^{2}=-I$, and

$R(\theta )=\left({\begin{matrix}\cos(\theta )&-\sin(\theta )\\\sin(\theta )&\cos(\theta )\end{matrix}}\right)=I\cos(\theta )+G\sin(\theta )$

reduces to the standard matrix for a plane rotation.

The matrix $P=-G^{2}$ projects a vector onto the ab-plane and the rotation only affects this part of the vector. An example illustrating this is a rotation of 30° = π/6 in the plane spanned by a and b.

Let N = I − P, so $N^{2}=N$ and its products with P and G are zero. This will allow us to evaluate powers of R.
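The closed form for the rotation can be compared against a direct series evaluation of the exponential of Gθ. The sketch below does this for a 30° rotation. The particular vectors a and b and the helper names are illustrative assumptions and are not taken from the example referred to in the text.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(X, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def rotation_closed_form(a, b, theta):
    """Return (G, R) with G = b a^T - a b^T and R = I + G sin + G^2 (1 - cos)."""
    n = len(a)
    G = [[b[i] * a[j] - a[i] * b[j] for j in range(n)] for i in range(n)]
    G2 = mat_mul(G, G)
    R = [[float(i == j) + G[i][j] * math.sin(theta)
          + G2[i][j] * (1.0 - math.cos(theta))
          for j in range(n)] for i in range(n)]
    return G, R

a = [1.0, 0.0, 0.0]
b = [0.0, 0.6, 0.8]              # unit vector perpendicular to a
theta = math.pi / 6              # 30 degrees
G, R = rotation_closed_form(a, b, theta)
R_series = expm_series([[g * theta for g in row] for row in G])
gap = max(abs(R[i][j] - R_series[i][j]) for i in range(3) for j in range(3))
Ra = [sum(R[i][j] * a[j] for j in range(3)) for i in range(3)]  # rotated a
```

The two evaluations agree to rounding error, and `Ra` equals a cos θ + b sin θ, as expected for a rotation within the ab-plane.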

### Nilpotent case

A matrix N is nilpotent if $N^{q}=0$ for some integer q. In this case, the matrix exponential $e^{N}$ can be computed directly from the series expansion, as the series terminates after a finite number of terms:

$e^{N}=I+N+{\frac {1}{2}}N^{2}+{\frac {1}{6}}N^{3}+\cdots +{\frac {1}{(q-1)!}}N^{q-1}.$

### Generalization

When the minimal polynomial of a matrix X can be factored into a product of first degree polynomials, it can be expressed as a sum

$X=A+N$

where

• A is diagonalizable
• N is nilpotent
• A commutes with N (i.e. AN = NA)

This is the Jordan–Chevalley decomposition.

This means that we can compute the exponential of X by reducing to the previous two cases:

$e^{X}=e^{A+N}=e^{A}e^{N}.$

Note that we need the commutativity of A and N for the last step to work.

Another (closely related) method, if the field is algebraically closed, is to work with the Jordan form of X. Suppose that $X=PJP^{-1}$ where J is the Jordan form of X. Then

$e^{X}=Pe^{J}P^{-1}.$

Also, since

$J=J_{a_{1}}(\lambda _{1})\oplus J_{a_{2}}(\lambda _{2})\oplus \cdots \oplus J_{a_{n}}(\lambda _{n}),$

${\begin{aligned}e^{J}&=\exp {\big (}J_{a_{1}}(\lambda _{1})\oplus J_{a_{2}}(\lambda _{2})\oplus \cdots \oplus J_{a_{n}}(\lambda _{n}){\big )}\\&=\exp {\big (}J_{a_{1}}(\lambda _{1}){\big )}\oplus \exp {\big (}J_{a_{2}}(\lambda _{2}){\big )}\oplus \cdots \oplus \exp {\big (}J_{a_{n}}(\lambda _{n}){\big )}.\end{aligned}}$

Therefore, we need only know how to compute the matrix exponential of a Jordan block. But each Jordan block is of the form

$J_{a}(\lambda )=\lambda I+N$

where N is a special nilpotent matrix. The matrix exponential of this block is given by

$e^{\lambda I+N}=e^{\lambda }e^{N}.$

### Evaluation by Laurent series

By virtue of the Cayley–Hamilton theorem the matrix exponential is expressible as a polynomial of order n−1.

If P and $Q_{t}$ are nonzero polynomials in one variable, such that P(A) = 0, and if the meromorphic function

$f(z)={\frac {e^{tz}-Q_{t}(z)}{P(z)}}$

is entire, then

$e^{tA}=Q_{t}(A).$

To prove this, multiply the first of the two above equalities by P(z) and replace z by A.

Such a polynomial $Q_{t}(z)$ can be found as follows (see Sylvester's formula). Letting a be a root of P, $Q_{a,t}(z)$ is solved from the product of P by the principal part of the Laurent series of f at a: it is proportional to the relevant Frobenius covariant. Then the sum $S_{t}$ of the $Q_{a,t}$, where a runs over all the roots of P, can be taken as a particular $Q_{t}$. All the other $Q_{t}$ will be obtained by adding a multiple of P to $S_{t}(z)$. In particular, $S_{t}(z)$, the Lagrange–Sylvester polynomial, is the only $Q_{t}$ whose degree is less than that of P.

Example: Consider the case of an arbitrary 2-by-2 matrix,

$A:={\begin{bmatrix}a&b\\c&d\end{bmatrix}}.$

The exponential matrix $e^{tA}$, by virtue of the Cayley–Hamilton theorem, must be of the form

$e^{tA}=s_{0}(t)\,I+s_{1}(t)\,A.$

(For any complex number z and any C-algebra B, we denote again by z the product of z by the unit of B.) Let α and β be the roots of the characteristic polynomial of A,

$P(z)=z^{2}-(a+d)\ z+ad-bc=(z-\alpha )(z-\beta ).$

Then we have

$S_{t}(z)=e^{\alpha t}{\frac {z-\beta }{\alpha -\beta }}+e^{\beta t}{\frac {z-\alpha }{\beta -\alpha }},$

and hence

$s_{0}(t)={\frac {\alpha \,e^{\beta t}-\beta \,e^{\alpha t}}{\alpha -\beta }},\quad s_{1}(t)={\frac {e^{\alpha t}-e^{\beta t}}{\alpha -\beta }}$

if α ≠ β; while, if α = β,

$S_{t}(z)=e^{\alpha t}(1+t(z-\alpha )),$

so that

$s_{0}(t)=(1-\alpha \,t)\,e^{\alpha t},\quad s_{1}(t)=t\,e^{\alpha t}.$

Defining

$s\equiv {\frac {\alpha +\beta }{2}}={\frac {\operatorname {tr} A}{2}},\qquad \qquad q\equiv {\frac {\alpha -\beta }{2}}=\pm {\sqrt {-\det \left(A-sI\right)}},$

we have

$s_{0}(t)=e^{st}\left(\cosh(qt)-s{\frac {\sinh(qt)}{q}}\right),\qquad s_{1}(t)=e^{st}{\frac {\sinh(qt)}{q}},$

where sinh(qt)/q is 0 if t = 0, and t if q = 0. Consequently,

$e^{tA}=e^{st}\left(\left(\cosh(qt)-s{\frac {\sinh(qt)}{q}}\right)I+{\frac {\sinh(qt)}{q}}A\right).$

Thus, as indicated above, the matrix A having been decomposed into the sum of two mutually commuting pieces, the traceful piece and the traceless piece,

$A=sI+(A-sI),$

the matrix exponential reduces to a plain product of the exponentials of the two respective pieces. This is a formula often used in physics, as it amounts to the analog of Euler's formula for Pauli spin matrices, that is, rotations of the doublet representation of the group SU(2).
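The expressions for s0(t) and s1(t) in terms of s and q translate directly into a short routine. The sketch below is one possible transcription. The function name `expm_2x2` is hypothetical, complex arithmetic is used so that an imaginary q (complex-conjugate eigenvalues) needs no special handling, and only the exactly degenerate case q = 0 is treated separately.

```python
import cmath

def expm_2x2(A, t=1.0):
    """e^{tA} for a 2x2 matrix A, as s0(t) I + s1(t) A."""
    (a, b), (c, d) = A
    s = (a + d) / 2                                  # s = tr(A) / 2
    q = cmath.sqrt(-((a - s) * (d - s) - b * c))     # q^2 = -det(A - sI)
    # sinh(qt)/q, with the limiting value t when q = 0
    sinch = t if q == 0 else cmath.sinh(q * t) / q
    s0 = cmath.exp(s * t) * (cmath.cosh(q * t) - s * sinch)
    s1 = cmath.exp(s * t) * sinch
    return [[s0 + s1 * a, s1 * b], [s1 * c, s0 + s1 * d]]

# Imaginary q: generator of plane rotations, e^{tA} = [[cos t, sin t], [-sin t, cos t]].
rot = expm_2x2([[0.0, 1.0], [-1.0, 0.0]], t=0.5)
# q = 0: a single Jordan block, e^{tA} = e^t [[1, t], [0, 1]].
jordan = expm_2x2([[1.0, 1.0], [0.0, 1.0]], t=2.0)
```

The entries are returned as complex numbers. For a real input matrix their imaginary parts vanish up to rounding.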

The polynomial $S_{t}$ can also be given the following "interpolation" characterization. Define $e_{t}(z)\equiv e^{tz}$, and n ≡ deg P. Then $S_{t}(z)$ is the unique degree < n polynomial which satisfies $S_{t}^{(k)}(a)=e_{t}^{(k)}(a)$ whenever k is less than the multiplicity of a as a root of P. We assume, as we obviously can, that P is the minimal polynomial of A. We further assume that A is a diagonalizable matrix. In particular, the roots of P are simple, and the "interpolation" characterization indicates that $S_{t}$ is given by the Lagrange interpolation formula, so it is the Lagrange–Sylvester polynomial.

At the other extreme, if $P=(z-a)^{n}$, then

$S_{t}=e^{at}\ \sum _{k=0}^{n-1}\ {\frac {t^{k}}{k!}}\ (z-a)^{k}.$

The simplest case not covered by the above observations is when $P=(z-a)^{2}\,(z-b)$ with a ≠ b, which yields

$S_{t}=e^{at}\ {\frac {z-b}{a-b}}\ {\Bigg (}1+\left(t+{\frac {1}{b-a}}\right)(z-a){\Bigg )}+e^{bt}\ {\frac {(z-a)^{2}}{(b-a)^{2}}}.$

### Evaluation by implementation of Sylvester's formula

A practical, expedited computation of the above reduces to the following rapid steps. Recall from above that an n-by-n matrix exp(tA) amounts to a linear combination of the first n−1 powers of A by the Cayley–Hamilton theorem. For diagonalizable matrices, as illustrated above, e.g. in the 2-by-2 case, Sylvester's formula yields $\exp(tA)=B_{\alpha }\exp(t\alpha )+B_{\beta }\exp(t\beta )$, where the Bs are the Frobenius covariants of A.

It is easiest, however, to simply solve for these Bs directly, by evaluating this expression and its first derivative at t = 0, in terms of A and I, to find the same answer as above.

But this simple procedure also works for defective matrices, in a generalization due to Buchheim. This is illustrated here for a 4-by-4 example of a matrix which is not diagonalizable, and for which the Bs are not projection matrices.

Consider

$A={\begin{pmatrix}1&1&0&0\\0&1&1&0\\0&0&1&-1/8\\0&0&1/2&1/2\end{pmatrix}},$

with eigenvalues $\lambda _{1}=3/4$ and $\lambda _{2}=1$, each with a multiplicity of two.

Consider the exponential of each eigenvalue multiplied by t, $\exp(\lambda _{i}t)$. Multiply each such by the corresponding undetermined coefficient matrix $B_{i}$. If the eigenvalues have an algebraic multiplicity greater than 1, then repeat the process, but now multiplying by an extra factor of t for each repetition, to ensure linear independence. (If one eigenvalue had a multiplicity of three, then there would be the three terms: $B_{i_{1}}e^{\lambda _{i}t},~B_{i_{2}}te^{\lambda _{i}t},~B_{i_{3}}t^{2}e^{\lambda _{i}t}$. By contrast, when all eigenvalues are distinct, the Bs are just the Frobenius covariants, and solving for them as below just amounts to the inversion of the Vandermonde matrix of these 4 eigenvalues.)

Sum all such terms, here four such:

$e^{At}=B_{1_{1}}e^{\lambda _{1}t}+B_{1_{2}}te^{\lambda _{1}t}+B_{2_{1}}e^{\lambda _{2}t}+B_{2_{2}}te^{\lambda _{2}t},$

$e^{At}=B_{1_{1}}e^{3t/4}+B_{1_{2}}te^{3t/4}+B_{2_{1}}e^{t}+B_{2_{2}}te^{t}.$

To solve for all of the unknown matrices B in terms of the first three powers of A and the identity, we need four equations, the above one providing one such at t = 0. Further, differentiate it with respect to t,

$Ae^{At}={\tfrac {3}{4}}B_{1_{1}}e^{3t/4}+\left({\tfrac {3}{4}}t+1\right)B_{1_{2}}e^{3t/4}+1B_{2_{1}}e^{t}+\left(1t+1\right)B_{2_{2}}e^{t},$

and again,

${\begin{aligned}A^{2}e^{At}&=\left({\tfrac {3}{4}}\right)^{2}B_{1_{1}}e^{3t/4}+\left(\left({\tfrac {3}{4}}\right)^{2}t+\left({\tfrac {3}{4}}+1\cdot {\tfrac {3}{4}}\right)\right)B_{1_{2}}e^{3t/4}+B_{2_{1}}e^{t}+\left(1^{2}t+(1+1\cdot 1)\right)B_{2_{2}}e^{t}\\&=\left({\tfrac {3}{4}}\right)^{2}B_{1_{1}}e^{3t/4}+\left(\left({\tfrac {3}{4}}\right)^{2}t+{\tfrac {3}{2}}\right)B_{1_{2}}e^{3t/4}+B_{2_{1}}e^{t}+\left(t+2\right)B_{2_{2}}e^{t},\end{aligned}}$

and once more,

${\begin{aligned}A^{3}e^{At}&=\left({\tfrac {3}{4}}\right)^{3}B_{1_{1}}e^{3t/4}+\left(\left({\tfrac {3}{4}}\right)^{3}t+\left(\left({\tfrac {3}{4}}\right)^{2}+{\tfrac {3}{2}}\cdot {\tfrac {3}{4}}\right)\right)B_{1_{2}}e^{3t/4}+B_{2_{1}}e^{t}+\left(1^{3}t+(1+2)\cdot 1\right)B_{2_{2}}e^{t}\\&=\left({\tfrac {3}{4}}\right)^{3}B_{1_{1}}e^{3t/4}+\left(\left({\tfrac {3}{4}}\right)^{3}t+{\tfrac {27}{16}}\right)B_{1_{2}}e^{3t/4}+B_{2_{1}}e^{t}+\left(t+3\right)B_{2_{2}}e^{t}.\end{aligned}}$

(In the general case, n−1 derivatives need be taken.)

Setting t = 0 in these four equations, the four coefficient matrices Bs may be solved for,

${\begin{aligned}I&=B_{1_{1}}+B_{2_{1}}\\A&={\tfrac {3}{4}}B_{1_{1}}+B_{1_{2}}+B_{2_{1}}+B_{2_{2}}\\A^{2}&=\left({\tfrac {3}{4}}\right)^{2}B_{1_{1}}+{\tfrac {3}{2}}B_{1_{2}}+B_{2_{1}}+2B_{2_{2}}\\A^{3}&=\left({\tfrac {3}{4}}\right)^{3}B_{1_{1}}+{\tfrac {27}{16}}B_{1_{2}}+B_{2_{1}}+3B_{2_{2}},\end{aligned}}$

to yield

${\begin{aligned}B_{1_{1}}&=128A^{3}-336A^{2}+288A-80I\\B_{1_{2}}&=16A^{3}-44A^{2}+40A-12I\\B_{2_{1}}&=-128A^{3}+336A^{2}-288A+81I\\B_{2_{2}}&=16A^{3}-40A^{2}+33A-9I.\end{aligned}}$

Substituting the value of A yields the coefficient matrices

${\begin{aligned}B_{1_{1}}&={\begin{pmatrix}0&0&48&-16\\0&0&-8&2\\0&0&1&0\\0&0&0&1\end{pmatrix}}\\B_{1_{2}}&={\begin{pmatrix}0&0&4&-2\\0&0&-1&1/2\\0&0&1/4&-1/8\\0&0&1/2&-1/4\end{pmatrix}}\\B_{2_{1}}&={\begin{pmatrix}1&0&-48&16\\0&1&8&-2\\0&0&0&0\\0&0&0&0\end{pmatrix}}\\B_{2_{2}}&={\begin{pmatrix}0&1&8&-2\\0&0&0&0\\0&0&0&0\\0&0&0&0\end{pmatrix}}\end{aligned}}$

so that the final result is

${e}^{tA}={\begin{pmatrix}{e}^{t}&t{e}^{t}&\left(8t-48\right){e}^{t}+\left(4t+48\right){e}^{3t/4}&\left(16-2\,t\right){e}^{t}+\left(-2t-16\right){e}^{3t/4}\\0&{e}^{t}&8{e}^{t}+\left(-t-8\right){e}^{3t/4}&-{\frac {4{e}^{t}+\left(-t-4\right){e}^{3t/4}}{2}}\\0&0&{\frac {\left(t+4\right){e}^{3t/4}}{4}}&-{\frac {t{e}^{3t/4}}{8}}\\0&0&{\frac {t{e}^{3t/4}}{2}}&-{\frac {\left(t-4\right){e}^{3t/4}}{4}}\end{pmatrix}}.$

The procedure is much shorter than Putzer's algorithm, which is sometimes used in such cases.
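As a sanity check on the 4-by-4 example, the four coefficient matrices listed above can be combined with their scalar factors and compared against a direct series evaluation of exp(tA). The sketch below does this at one arbitrarily chosen time. The helper names and the value t = 1.5 are assumptions made for illustration.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(X, terms=40):
    """Approximate e^X by a truncated power series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

A = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, -0.125], [0, 0, 0.5, 0.5]]
B11 = [[0, 0, 48, -16], [0, 0, -8, 2], [0, 0, 1, 0], [0, 0, 0, 1]]
B12 = [[0, 0, 4, -2], [0, 0, -1, 0.5], [0, 0, 0.25, -0.125], [0, 0, 0.5, -0.25]]
B21 = [[1, 0, -48, 16], [0, 1, 8, -2], [0, 0, 0, 0], [0, 0, 0, 0]]
B22 = [[0, 1, 8, -2], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

def exp_tA_from_coefficients(t):
    """B11 e^{3t/4} + B12 t e^{3t/4} + B21 e^t + B22 t e^t."""
    w = [math.exp(0.75 * t), t * math.exp(0.75 * t), math.exp(t), t * math.exp(t)]
    return [[w[0] * B11[i][j] + w[1] * B12[i][j] + w[2] * B21[i][j] + w[3] * B22[i][j]
             for j in range(4)] for i in range(4)]

t = 1.5
closed = exp_tA_from_coefficients(t)
series = expm_series([[a * t for a in row] for row in A])
gap = max(abs(closed[i][j] - series[i][j]) for i in range(4) for j in range(4))
# The t = 0 condition requires B11 + B21 = I.
sum_is_identity = all(B11[i][j] + B21[i][j] == (1 if i == j else 0)
                      for i in range(4) for j in range(4))
```

The two evaluations agree to rounding error, and the t = 0 condition holds exactly.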

## Illustrations

Suppose that we want to compute the exponential of

$B={\begin{bmatrix}21&17&6\\-5&-1&-6\\4&4&16\end{bmatrix}}.$

Its Jordan form is

$J=P^{-1}BP={\begin{bmatrix}4&0&0\\0&16&1\\0&0&16\end{bmatrix}},$

where the matrix P is given by

$P={\begin{bmatrix}-{\frac {1}{4}}&2&{\frac {5}{4}}\\{\frac {1}{4}}&-2&-{\frac {1}{4}}\\0&4&0\end{bmatrix}}.$

Let us first calculate exp(J). We have

$J=J_{1}(4)\oplus J_{2}(16).$

The exponential of a 1×1 matrix is just the exponential of the one entry of the matrix, so $\exp(J_{1}(4))=[e^{4}]$. The exponential of $J_{2}(16)$ can be calculated by the formula $e^{\lambda I+N}=e^{\lambda }e^{N}$ mentioned above; this yields

${\begin{aligned}\exp \left({\begin{bmatrix}16&1\\0&16\end{bmatrix}}\right)&=e^{16}\exp \left({\begin{bmatrix}0&1\\0&0\end{bmatrix}}\right)\\[6pt]&=e^{16}\left({\begin{bmatrix}1&0\\0&1\end{bmatrix}}+{\begin{bmatrix}0&1\\0&0\end{bmatrix}}+{1 \over 2!}{\begin{bmatrix}0&0\\0&0\end{bmatrix}}+\cdots \right)={\begin{bmatrix}e^{16}&e^{16}\\0&e^{16}\end{bmatrix}}.\end{aligned}}$

Therefore, the exponential of the original matrix B is