QR decomposition

In linear algebra, a QR decomposition (also called a QR factorization) is a decomposition of a matrix A into a product A = QR of an orthogonal matrix Q and an upper triangular matrix R. QR decomposition is often used to solve the linear least squares problem, and is the basis for a particular eigenvalue algorithm, the QR algorithm.

If A has n linearly independent columns, then the first n columns of Q form an orthonormal basis for the column space of A. More specifically, the first k columns of Q form an orthonormal basis for the span of the first k columns of A for any 1 ≤ k ≤ n.[1] The fact that any column k of A only depends on the first k columns of Q is responsible for the triangular form of R.[1]

History

The QR algorithm for the computation of eigenvalues, which is based on the QR decomposition, is considered to be one of the 10 most important algorithms of the 20th century.[2] It was independently discovered by the English computer scientist John G. F. Francis and the Soviet mathematician Vera Kublanovskaya in 1961.

Definitions

Square matrix

Any real square matrix A may be decomposed as

${\displaystyle A=QR,\,}$

where Q is an orthogonal matrix (its columns are orthogonal unit vectors, meaning ${\displaystyle Q^{T}Q=I}$) and R is an upper triangular matrix (also called right triangular matrix). This generalizes to a complex square matrix A and a unitary matrix Q (where ${\displaystyle Q^{*}Q=I}$). If A is invertible, then the factorization is unique if we require that the diagonal elements of R are positive.

Rectangular matrix

More generally, we can factor a complex m×n matrix A, with m ≥ n, as the product of an m×m unitary matrix Q and an m×n upper triangular matrix R. As the bottom (m − n) rows of an m×n upper triangular matrix consist entirely of zeroes, it is often useful to partition R, or both R and Q:

${\displaystyle A=QR=Q{\begin{bmatrix}R_{1}\\0\end{bmatrix}}={\begin{bmatrix}Q_{1},Q_{2}\end{bmatrix}}{\begin{bmatrix}R_{1}\\0\end{bmatrix}}=Q_{1}R_{1},}$

where R1 is an n×n upper triangular matrix, 0 is an (m − n)×n zero matrix, Q1 is m×n, Q2 is m×(m − n), and Q1 and Q2 both have orthogonal columns.

Some authors call Q1R1 the thin QR factorization of A; Trefethen and Bau call this the reduced QR factorization.[1] If A is of full rank n and we require that the diagonal elements of R1 are positive, then R1 and Q1 are unique, but in general Q2 is not. R1 is then equal to the upper triangular factor of the Cholesky decomposition of ${\displaystyle A^{*}A}$ (= ${\displaystyle A^{T}A}$ if A is real).

QL, RQ and LQ decompositions

Analogously, we can define QL, RQ, and LQ decompositions, with L being a lower triangular matrix.

Computing the QR decomposition

There are several methods for actually computing the QR decomposition, such as by means of the Gram–Schmidt process, Householder transformations, or Givens rotations. Each has a number of advantages and disadvantages.

Using the Gram–Schmidt process

Define the projection:

${\displaystyle \mathrm {proj} _{\mathbf {e} }\mathbf {a} ={\frac {\left\langle \mathbf {e} ,\mathbf {a} \right\rangle }{\left\langle \mathbf {e} ,\mathbf {e} \right\rangle }}\mathbf {e} }$

then:

${\displaystyle {\begin{aligned}\mathbf {u} _{1}&=\mathbf {a} _{1},&\mathbf {e} _{1}&={\mathbf {u} _{1} \over \|\mathbf {u} _{1}\|}\\\mathbf {u} _{2}&=\mathbf {a} _{2}-\mathrm {proj} _{\mathbf {e} _{1}}\,\mathbf {a} _{2},&\mathbf {e} _{2}&={\mathbf {u} _{2} \over \|\mathbf {u} _{2}\|}\\\mathbf {u} _{3}&=\mathbf {a} _{3}-\mathrm {proj} _{\mathbf {e} _{1}}\,\mathbf {a} _{3}-\mathrm {proj} _{\mathbf {e} _{2}}\,\mathbf {a} _{3},&\mathbf {e} _{3}&={\mathbf {u} _{3} \over \|\mathbf {u} _{3}\|}\\&\vdots &&\vdots \\\mathbf {u} _{k}&=\mathbf {a} _{k}-\sum _{j=1}^{k-1}\mathrm {proj} _{\mathbf {e} _{j}}\,\mathbf {a} _{k},&\mathbf {e} _{k}&={\mathbf {u} _{k} \over \|\mathbf {u} _{k}\|}\end{aligned}}}$

We then rearrange the equations above so that the ${\displaystyle \mathbf {a} _{i}}$s are on the left, using the fact that the ${\displaystyle \mathbf {e} _{i}}$ are unit vectors:

${\displaystyle {\begin{aligned}\mathbf {a} _{1}&=\langle \mathbf {e} _{1},\mathbf {a} _{1}\rangle \mathbf {e} _{1}\\\mathbf {a} _{2}&=\langle \mathbf {e} _{1},\mathbf {a} _{2}\rangle \mathbf {e} _{1}+\langle \mathbf {e} _{2},\mathbf {a} _{2}\rangle \mathbf {e} _{2}\\\mathbf {a} _{3}&=\langle \mathbf {e} _{1},\mathbf {a} _{3}\rangle \mathbf {e} _{1}+\langle \mathbf {e} _{2},\mathbf {a} _{3}\rangle \mathbf {e} _{2}+\langle \mathbf {e} _{3},\mathbf {a} _{3}\rangle \mathbf {e} _{3}\\&\vdots \\\mathbf {a} _{k}&=\sum _{j=1}^{k}\langle \mathbf {e} _{j},\mathbf {a} _{k}\rangle \mathbf {e} _{j}\end{aligned}}}$

where ${\displaystyle \langle \mathbf {e} _{i},\mathbf {a} _{i}\rangle =\|\mathbf {u} _{i}\|}$. This can be written in matrix form:

${\displaystyle A=QR}$

where:

${\displaystyle Q=\left[\mathbf {e} _{1},\cdots ,\mathbf {e} _{n}\right]\qquad {\text{and}}\qquad R={\begin{pmatrix}\langle \mathbf {e} _{1},\mathbf {a} _{1}\rangle &\langle \mathbf {e} _{1},\mathbf {a} _{2}\rangle &\langle \mathbf {e} _{1},\mathbf {a} _{3}\rangle &\ldots \\0&\langle \mathbf {e} _{2},\mathbf {a} _{2}\rangle &\langle \mathbf {e} _{2},\mathbf {a} _{3}\rangle &\ldots \\0&0&\langle \mathbf {e} _{3},\mathbf {a} _{3}\rangle &\ldots \\\vdots &\vdots &\vdots &\ddots \end{pmatrix}}.}$
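The construction above translates fairly directly into code. The following is a minimal sketch in plain Python of classical Gram–Schmidt, assuming a real matrix with linearly independent columns stored as a list of rows; the names `dot` and `gram_schmidt_qr` are illustrative and not taken from any library.

```python
import math

def dot(x, y):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR of a real matrix with independent columns.

    A is a list of rows; returns (Q, R) as lists of rows with A = Q R.
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]  # a_1, ..., a_n
    es = []                                 # orthonormal vectors e_1, ..., e_n
    R = [[0.0] * n for _ in range(n)]
    for k, a in enumerate(cols):
        u = list(a)
        for j, e in enumerate(es):
            R[j][k] = dot(e, a)             # <e_j, a_k>
            u = [ui - R[j][k] * ei for ui, ei in zip(u, e)]
        R[k][k] = math.sqrt(dot(u, u))      # ||u_k|| = <e_k, a_k>
        es.append([ui / R[k][k] for ui in u])
    Q = [[es[j][i] for j in range(n)] for i in range(m)]    # columns of Q are e_j
    return Q, R

A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
Q, R = gram_schmidt_qr(A)
```

The sketch follows the formulas literally and makes no attempt to guard against rank deficiency or rounding error.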

Example

Consider the decomposition of

${\displaystyle A={\begin{pmatrix}12&-51&4\\6&167&-68\\-4&24&-41\end{pmatrix}}.}$

Recall that an orthogonal matrix ${\displaystyle Q}$ has the property

${\displaystyle {\begin{matrix}Q^{T}\,Q=I.\end{matrix}}}$

Then, we can calculate ${\displaystyle Q}$ by means of Gram–Schmidt as follows:

${\displaystyle U={\begin{pmatrix}\mathbf {u} _{1}&\mathbf {u} _{2}&\mathbf {u} _{3}\end{pmatrix}}={\begin{pmatrix}12&-69&-58/5\\6&158&6/5\\-4&30&-33\end{pmatrix}};}$
${\displaystyle Q={\begin{pmatrix}{\frac {\mathbf {u} _{1}}{\|\mathbf {u} _{1}\|}}&{\frac {\mathbf {u} _{2}}{\|\mathbf {u} _{2}\|}}&{\frac {\mathbf {u} _{3}}{\|\mathbf {u} _{3}\|}}\end{pmatrix}}={\begin{pmatrix}6/7&-69/175&-58/175\\3/7&158/175&6/175\\-2/7&6/35&-33/35\end{pmatrix}}.}$

Thus, we have

${\displaystyle {\begin{matrix}Q^{T}A=Q^{T}Q\,R=R;\end{matrix}}}$
${\displaystyle {\begin{matrix}R=Q^{T}A=\end{matrix}}{\begin{pmatrix}14&21&-14\\0&175&-70\\0&0&35\end{pmatrix}}.}$

Relation to RQ decomposition

The RQ decomposition transforms a matrix A into the product of an upper triangular matrix R (also known as right-triangular) and an orthogonal matrix Q. The only difference from QR decomposition is the order of these matrices.

QR decomposition is Gram–Schmidt orthogonalization of columns of A, started from the first column.

RQ decomposition is Gram–Schmidt orthogonalization of rows of A, started from the last row.
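As an illustration of the statement about rows, here is a small sketch of an RQ decomposition obtained by orthogonalizing the rows of A from the last row upwards. It assumes a real square matrix with linearly independent rows; `gram_schmidt_rq` is a hypothetical helper name.

```python
import math

def dot(x, y):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt_rq(A):
    """RQ of a real square matrix with independent rows: A = R Q.

    Rows are orthogonalized starting from the last one, so row i of A is a
    combination of rows i, i+1, ..., n-1 of Q, which makes R upper triangular.
    """
    n = len(A)
    Q = [None] * n
    R = [[0.0] * n for _ in range(n)]
    for i in reversed(range(n)):
        u = list(A[i])
        for j in range(i + 1, n):
            R[i][j] = dot(A[i], Q[j])       # component along an already built row
            u = [uk - R[i][j] * qk for uk, qk in zip(u, Q[j])]
        R[i][i] = math.sqrt(dot(u, u))
        Q[i] = [uk / R[i][i] for uk in u]
    return R, Q

A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
R, Q = gram_schmidt_rq(A)
```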

Using Householder reflections

Householder reflection for QR-decomposition: The goal is to find a linear transformation that changes the vector ${\displaystyle x}$ into a vector of the same length which is collinear to ${\displaystyle e_{1}}$. We could use an orthogonal projection (Gram–Schmidt), but this will be numerically unstable if the vectors ${\displaystyle x}$ and ${\displaystyle e_{1}}$ are close to orthogonal. Instead, the Householder reflection reflects through the dotted line (chosen to bisect the angle between ${\displaystyle x}$ and ${\displaystyle e_{1}}$). The maximum angle with this transform is 45 degrees.

A Householder reflection (or Householder transformation) is a transformation that takes a vector and reflects it about some plane or hyperplane. We can use this operation to calculate the QR factorization of an m-by-n matrix ${\displaystyle A}$ with m ≥ n.

Q can be used to reflect a vector in such a way that all coordinates but one disappear.

Let ${\displaystyle \mathbf {x} }$ be an arbitrary real m-dimensional column vector of ${\displaystyle A}$ such that ${\displaystyle \|\mathbf {x} \|=|\alpha |}$ for a scalar α. If the algorithm is implemented using floating-point arithmetic, then α should be given the opposite sign to the k-th coordinate of ${\displaystyle \mathbf {x} }$, where ${\displaystyle x_{k}}$ is the pivot coordinate below which all entries are 0 in matrix A's final upper triangular form; this choice avoids loss of significance. In the complex case, set

${\displaystyle \alpha =-\mathrm {e} ^{\mathrm {i} \arg x_{k}}\|\mathbf {x} \|}$

and substitute conjugate transposition for transposition in the construction of Q below.

Then, where ${\displaystyle \mathbf {e} _{1}}$ is the vector (1,0,...,0)T, ||·|| is the Euclidean norm and ${\displaystyle I}$ is an m-by-m identity matrix, set

${\displaystyle \mathbf {u} =\mathbf {x} -\alpha \mathbf {e} _{1},}$
${\displaystyle \mathbf {v} ={\mathbf {u} \over \|\mathbf {u} \|},}$
${\displaystyle Q=I-2\mathbf {v} \mathbf {v} ^{T}.}$

Or, if ${\displaystyle A}$ is complex

${\displaystyle Q=I-(1+w)\mathbf {v} \mathbf {v} ^{H}}$, where ${\displaystyle w=\mathbf {x} ^{H}\mathbf {v} /\mathbf {v} ^{H}\mathbf {x} }$ and ${\displaystyle \mathbf {x} ^{H}}$ is the conjugate transpose (transjugate) of ${\displaystyle \mathbf {x} }$.

${\displaystyle Q}$ is an m-by-m Householder matrix and

${\displaystyle Q\mathbf {x} =(\alpha ,0,\cdots ,0)^{T}.\,}$

This can be used to gradually transform an m-by-n matrix A to upper triangular form. First, we multiply A with the Householder matrix Q1 we obtain when we choose the first matrix column for x. This results in a matrix Q1A with zeros in the left column (except for the first row).

${\displaystyle Q_{1}A={\begin{bmatrix}\alpha _{1}&\star &\dots &\star \\0&&&\\\vdots &&A'&\\0&&&\end{bmatrix}}}$

This can be repeated for A′ (obtained from Q1A by deleting the first row and first column), resulting in a Householder matrix Q2. Note that Q2 is smaller than Q1. Since we want it really to operate on Q1A instead of A′ we need to expand it to the upper left, filling in a 1, or in general:

${\displaystyle Q_{k}={\begin{pmatrix}I_{k-1}&0\\0&Q_{k}'\end{pmatrix}}.}$

After ${\displaystyle t}$ iterations of this process, ${\displaystyle t=\min(m-1,n)}$,

${\displaystyle R=Q_{t}\cdots Q_{2}Q_{1}A}$

is an upper triangular matrix. So, with

${\displaystyle Q=Q_{1}^{T}Q_{2}^{T}\cdots Q_{t}^{T},}$

${\displaystyle A=QR}$ is a QR decomposition of ${\displaystyle A}$.

This method has greater numerical stability than the Gram–Schmidt method above.
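The procedure described in this section can be sketched as follows for a real m-by-n matrix with m ≥ n. This is an illustrative plain-Python version, not a reference implementation: `householder_qr` is a made-up name, the reflections are applied to the trailing rows and columns directly instead of forming each embedded matrix, and α is given the opposite sign to the pivot entry as recommended above for floating-point arithmetic.

```python
import math

def householder_qr(A):
    """Householder QR of a real m-by-n matrix (m >= n), given as a list of rows.

    Returns (Q, R) with Q m-by-m orthogonal and R m-by-n upper triangular.
    """
    m, n = len(A), len(A[0])
    R = [[float(x) for x in row] for row in A]
    Q = [[float(i == j) for j in range(m)] for i in range(m)]  # accumulates Q_1 Q_2 ... Q_t
    for k in range(min(m - 1, n)):
        x = [R[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(xi * xi for xi in x))
        if norm_x == 0.0:
            continue                            # nothing to eliminate in this column
        alpha = -math.copysign(norm_x, x[0])    # opposite sign to the pivot entry
        u = x[:]
        u[0] -= alpha                           # u = x - alpha e_1
        norm_u = math.sqrt(sum(ui * ui for ui in u))
        v = [ui / norm_u for ui in u]
        # R <- Q_k R: apply I - 2 v v^T to rows k..m-1 of R.
        for j in range(n):
            s = 2.0 * sum(v[i] * R[k + i][j] for i in range(m - k))
            for i in range(m - k):
                R[k + i][j] -= s * v[i]
        # Q <- Q Q_k: apply I - 2 v v^T to columns k..m-1 of Q.
        for i in range(m):
            s = 2.0 * sum(Q[i][k + j] * v[j] for j in range(m - k))
            for j in range(m - k):
                Q[i][k + j] -= s * v[j]
    return Q, R

A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
Q, R = householder_qr(A)
```

Because α is chosen with the opposite sign to the pivot, the signs of some rows of R (and of the matching columns of Q) can differ from a hand calculation that takes α = +‖x‖; the product QR is the same either way.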

The following table gives the number of operations in the k-th step of the QR-decomposition by the Householder transformation, assuming a square matrix with size n.

| Operation | Number of operations in the k-th step |
| --- | --- |
| multiplications | ${\displaystyle 2(n-k+1)^{2}}$ |
| additions | ${\displaystyle (n-k+1)^{2}+(n-k+1)(n-k)+2}$ |
| division | ${\displaystyle 1}$ |
| square root | ${\displaystyle 1}$ |

Summing these numbers over the n − 1 steps (for a square matrix of size n), the complexity of the algorithm (in terms of floating point multiplications) is given by

${\displaystyle {\frac {2}{3}}n^{3}+n^{2}+{\frac {1}{3}}n-2=O(n^{3}).}$
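The closed form can be checked against the per-step multiplication count by summing it directly. The snippet below is only a sanity check using exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction

def multiplications_total(n):
    """Sum the per-step multiplication count 2(n-k+1)^2 over k = 1, ..., n-1."""
    return sum(2 * (n - k + 1) ** 2 for k in range(1, n))

def closed_form(n):
    """The closed form (2/3)n^3 + n^2 + (1/3)n - 2, evaluated exactly."""
    n = Fraction(n)
    return Fraction(2, 3) * n**3 + n**2 + Fraction(1, 3) * n - 2

# The two expressions agree for every size tried here.
for n in range(2, 50):
    assert multiplications_total(n) == closed_form(n)
```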

Example

Let us calculate the decomposition of

${\displaystyle A={\begin{pmatrix}12&-51&4\\6&167&-68\\-4&24&-41\end{pmatrix}}.}$

First, we need to find a reflection that transforms the first column of matrix A, vector ${\displaystyle \mathbf {a} _{1}=(12,6,-4)^{T}}$, to ${\displaystyle \|\mathbf {a} _{1}\|\;\mathrm {e} _{1}=(14,0,0)^{T}.}$

Now,

${\displaystyle \mathbf {u} =\mathbf {x} -\alpha \mathbf {e} _{1},}$

and

${\displaystyle \mathbf {v} ={\mathbf {u} \over \|\mathbf {u} \|}.}$

Here,

${\displaystyle \alpha =14}$ and ${\displaystyle \mathbf {x} =\mathbf {a} _{1}=(12,6,-4)^{T}}$

Therefore

${\displaystyle {\mathbf {u} }=(-2,6,-4)^{T}=({2})(-1,3,-2)^{T}}$ and ${\displaystyle \mathbf {v} ={1 \over {\sqrt {14}}}(-1,3,-2)^{T}}$, and then
${\displaystyle Q_{1}=I-{2 \over {\sqrt {14}}{\sqrt {14}}}{\begin{pmatrix}-1\\3\\-2\end{pmatrix}}{\begin{pmatrix}-1&3&-2\end{pmatrix}}}$
${\displaystyle =I-{1 \over 7}{\begin{pmatrix}1&-3&2\\-3&9&-6\\2&-6&4\end{pmatrix}}}$
${\displaystyle ={\begin{pmatrix}6/7&3/7&-2/7\\3/7&-2/7&6/7\\-2/7&6/7&3/7\\\end{pmatrix}}.}$

Now observe:

${\displaystyle Q_{1}A={\begin{pmatrix}14&21&-14\\0&-49&-14\\0&168&-77\end{pmatrix}},}$

so we already have almost a triangular matrix. We only need to zero the (3, 2) entry.

Take the (1, 1) minor, and then apply the process again to

${\displaystyle A'=M_{11}={\begin{pmatrix}-49&-14\\168&-77\end{pmatrix}}.}$

By the same method as above, we obtain the matrix of the Householder transformation

${\displaystyle Q_{2}={\begin{pmatrix}1&0&0\\0&-7/25&24/25\\0&24/25&7/25\end{pmatrix}}}$

after performing a direct sum with 1 to make sure the next step in the process works properly.

Now, we find

${\displaystyle Q=Q_{1}^{T}Q_{2}^{T}={\begin{pmatrix}6/7&-69/175&58/175\\3/7&158/175&-6/175\\-2/7&6/35&33/35\end{pmatrix}}}$

Or, to four decimal places,

${\displaystyle Q=Q_{1}^{T}Q_{2}^{T}={\begin{pmatrix}0.8571&-0.3943&0.3314\\0.4286&0.9029&-0.0343\\-0.2857&0.1714&0.9429\end{pmatrix}}}$
${\displaystyle R=Q_{2}Q_{1}A=Q^{T}A={\begin{pmatrix}14&21&-14\\0&175&-70\\0&0&-35\end{pmatrix}}.}$

The matrix Q is orthogonal and R is upper triangular, so A = QR is the required QR-decomposition.
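The arithmetic of this example can be verified exactly with rational numbers. The check below takes the two reflection matrices stated in the example as given and confirms that their product is orthogonal and reproduces A; the helpers `matmul` and `transpose` are ad hoc.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
Q1 = [[F(6, 7), F(3, 7), F(-2, 7)],
      [F(3, 7), F(-2, 7), F(6, 7)],
      [F(-2, 7), F(6, 7), F(3, 7)]]
Q2 = [[1, 0, 0],
      [0, F(-7, 25), F(24, 25)],
      [0, F(24, 25), F(7, 25)]]

Q = matmul(transpose(Q1), transpose(Q2))
R = matmul(transpose(Q), A)

identity = [[int(i == j) for j in range(3)] for i in range(3)]
assert matmul(transpose(Q), Q) == identity                    # Q is orthogonal
assert R == [[14, 21, -14], [0, 175, -70], [0, 0, -35]]       # R is upper triangular
assert matmul(Q, R) == A                                      # A = QR
```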

Using Givens rotations

QR decompositions can also be computed with a series of Givens rotations. Each rotation zeros an element in the subdiagonal of the matrix, forming the R matrix. The concatenation of all the Givens rotations forms the orthogonal Q matrix.

In practice, Givens rotations are not actually performed by building a whole matrix and doing a matrix multiplication. A Givens rotation procedure is used instead which does the equivalent of the sparse Givens matrix multiplication, without the extra work of handling the sparse elements. The Givens rotation procedure is useful in situations where only a relatively few off diagonal elements need to be zeroed, and is more easily parallelized than Householder transformations.

Example

Let us calculate the decomposition of

${\displaystyle A={\begin{pmatrix}12&-51&4\\6&167&-68\\-4&24&-41\end{pmatrix}}.}$

First, we need to form a rotation matrix that will zero the lowermost left element, ${\displaystyle \mathbf {a} _{31}=-4}$. We form this matrix using the Givens rotation method, and call the matrix ${\displaystyle G_{1}}$. We will first rotate the vector ${\displaystyle (6,-4)}$, to point along the X axis. This vector has an angle ${\displaystyle \theta =\arctan \left({-(-4) \over 6}\right)}$. We create the orthogonal Givens rotation matrix, ${\displaystyle G_{1}}$:

${\displaystyle G_{1}={\begin{pmatrix}1&0&0\\0&\cos(\theta )&-\sin(\theta )\\0&\sin(\theta )&\cos(\theta )\end{pmatrix}}}$
${\displaystyle \approx {\begin{pmatrix}1&0&0\\0&0.83205&-0.55470\\0&0.55470&0.83205\end{pmatrix}}}$

And the result of ${\displaystyle G_{1}A}$ now has a zero in the ${\displaystyle \mathbf {a} _{31}}$ element.

${\displaystyle G_{1}A\approx {\begin{pmatrix}12&-51&4\\7.21110&125.6396&-33.83671\\0&112.6041&-71.83368\end{pmatrix}}}$

We can similarly form Givens matrices ${\displaystyle G_{2}}$ and ${\displaystyle G_{3}}$, which will zero the sub-diagonal elements ${\displaystyle a_{21}}$ and ${\displaystyle a_{32}}$, forming a triangular matrix ${\displaystyle R}$. The orthogonal matrix ${\displaystyle Q^{T}}$ is formed from the concatenation of all the Givens matrices ${\displaystyle Q^{T}=G_{3}G_{2}G_{1}}$. Thus, we have ${\displaystyle G_{3}G_{2}G_{1}A=Q^{T}A=R}$, and the QR decomposition is ${\displaystyle A=QR}$.
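A compact sketch of the whole procedure is given below. It assumes a real matrix stored as a list of rows and, as described earlier, updates only the two rows touched by each rotation instead of building full Givens matrices; `givens_qr` is an illustrative name. The elimination order (bottom of each column first) matches the example.

```python
import math

def givens_qr(A):
    """QR of a real m-by-n matrix via Givens rotations (list-of-rows input).

    Returns (Q, R) with Q m-by-m orthogonal and R m-by-n upper triangular.
    """
    m, n = len(A), len(A[0])
    R = [[float(x) for x in row] for row in A]
    Qt = [[float(i == j) for j in range(m)] for i in range(m)]  # accumulates G_k ... G_1
    for j in range(n):
        for i in range(m - 1, j, -1):           # zero column j from the bottom up
            a, b = R[i - 1][j], R[i][j]
            if b == 0.0:
                continue                        # entry is already zero
            r = math.hypot(a, b)
            c, s = a / r, b / r
            # Rotate rows i-1 and i only; all other rows are untouched.
            for M in (R, Qt):
                for k in range(len(M[0])):
                    top, bot = M[i - 1][k], M[i][k]
                    M[i - 1][k] = c * top + s * bot
                    M[i][k] = -s * top + c * bot
            R[i][j] = 0.0                       # clear rounding residue
    Q = [list(col) for col in zip(*Qt)]         # Q is the transpose of G_k ... G_1
    return Q, R

A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
Q, R = givens_qr(A)
```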

Connection to a determinant or a product of eigenvalues

We can use QR decomposition to find the absolute value of the determinant of a square matrix. Suppose a matrix is decomposed as ${\displaystyle A=QR}$. Then we have

${\displaystyle \det(A)=\det(Q)\cdot \det(R).}$

Since Q is unitary, ${\displaystyle |\det(Q)|=1}$. Thus,

${\displaystyle |\det(A)|=|\det(R)|={\Big |}\prod _{i}r_{ii}{\Big |},}$

where ${\displaystyle r_{ii}}$ are the entries on the diagonal of R.

Furthermore, because the determinant equals the product of the eigenvalues, we have

${\displaystyle {\Big |}\prod _{i}r_{ii}{\Big |}={\Big |}\prod _{i}\lambda _{i}{\Big |},}$

where the ${\displaystyle \lambda _{i}}$ are the eigenvalues of ${\displaystyle A}$.

We can extend the above properties to a non-square complex matrix ${\displaystyle A}$ by introducing the definition of QR decomposition for non-square complex matrices and replacing eigenvalues with singular values.

Suppose a QR decomposition for a non-square matrix A:

${\displaystyle A=Q{\begin{pmatrix}R\\O\end{pmatrix}},\qquad Q^{*}Q=I,}$

where ${\displaystyle O}$ is a zero matrix and ${\displaystyle Q}$ is a unitary matrix.

From the properties of SVD and determinant of matrix, we have

${\displaystyle {\Big |}\prod _{i}r_{ii}{\Big |}=\prod _{i}\sigma _{i},}$

where the ${\displaystyle \sigma _{i}}$ are the singular values of ${\displaystyle A}$.

Note that the singular values of ${\displaystyle A}$ and ${\displaystyle R}$ are identical, although their complex eigenvalues may be different. However, if A is square, the following is true:

${\displaystyle {\prod _{i}\sigma _{i}}={\Big |}{\prod _{i}\lambda _{i}}{\Big |}.}$

In conclusion, QR decomposition can be used efficiently to calculate the product of the eigenvalues or singular values of a matrix.
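As a small numerical illustration, the determinant of the 3×3 matrix used in the examples can be compared with the product of the diagonal of its triangular factor. The values of R are copied from the Gram–Schmidt example; `det3` is an ad hoc helper.

```python
def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]
R = [[14, 21, -14], [0, 175, -70], [0, 0, 35]]   # factor from the Gram-Schmidt example

prod_diag = 1
for k in range(3):
    prod_diag *= R[k][k]

print(det3(A), prod_diag)   # prints: -85750 85750
assert abs(det3(A)) == abs(prod_diag)
```

Only the absolute values agree here, since det(Q) = −1 for the Q of that example.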

Column pivoting

QR decomposition with column pivoting introduces a permutation matrix P:

${\displaystyle AP=QR\quad \iff A=QRP^{T}}$

Column pivoting is useful when A is (nearly) rank deficient, or is suspected of being so. It can also improve numerical accuracy. P is usually chosen so that the diagonal elements of R are non-increasing: ${\displaystyle |r_{11}|\geq |r_{22}|\geq \ldots \geq |r_{nn}|}$. This can be used to find the (numerical) rank of A at lower computational cost than a singular value decomposition, forming the basis of so-called rank-revealing QR algorithms.
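One simple way to realize such a pivoting strategy is sketched below. This is an illustrative variant built on modified Gram–Schmidt, not the Householder-based routine found in numerical libraries: at each step the remaining column with the largest residual norm is moved to the front. The function name `pivoted_qr` and the absolute tolerance `tol` are assumptions made for this sketch.

```python
import math

def pivoted_qr(A, tol=1e-12):
    """Thin QR with column pivoting via modified Gram-Schmidt.

    Returns (Q, R, perm) such that column j of Q R equals column perm[j] of A
    and |r_11| >= |r_22| >= ...; the number of rows of R is the numerical rank.
    """
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]  # working residuals
    perm = list(range(n))
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Pivot: bring the remaining column of largest residual norm to position k.
        norms = [math.sqrt(sum(x * x for x in cols[j])) for j in range(k, n)]
        p = k + norms.index(max(norms))
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        for row in R:
            row[k], row[p] = row[p], row[k]
        R[k][k] = max(norms)
        if R[k][k] <= tol:
            break                       # remaining columns are numerically zero
        q = [x / R[k][k] for x in cols[k]]
        q_cols.append(q)
        for j in range(k + 1, n):       # remove the q component from later columns
            R[k][j] = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [x - R[k][j] * y for x, y in zip(cols[j], q)]
    Q = [[q[i] for q in q_cols] for i in range(m)]
    return Q, R[:len(q_cols)], perm

# Third column is the sum of the first two, so the rank is 2.
A = [[1, 2, 3], [4, 5, 9], [7, 8, 15], [2, 1, 3]]
Q, R, perm = pivoted_qr(A)
rank = len(R)
```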

Using for solution to linear inverse problems

Compared to the direct matrix inverse, inverse solutions using QR decomposition are more numerically stable as evidenced by their reduced condition numbers [Parker, Geophysical Inverse Theory, Ch1.13].

To solve the underdetermined (${\displaystyle m<n}$) linear problem ${\displaystyle Ax=b}$ where the matrix A has dimensions ${\displaystyle m\times n}$ and rank ${\displaystyle m}$, first find the QR factorization of the transpose of A: ${\displaystyle A^{T}=QR}$, where Q is an orthogonal matrix (i.e. ${\displaystyle Q^{T}=Q^{-1}}$), and R has a special form: ${\displaystyle R={\begin{bmatrix}R_{1}\\0\end{bmatrix}}}$. Here ${\displaystyle R_{1}}$ is a square ${\displaystyle m\times m}$ right triangular matrix, and the zero matrix has dimension ${\displaystyle (n-m)\times m}$. After some algebra, it can be shown that a solution to the inverse problem can be expressed as: ${\displaystyle x=Q{\begin{bmatrix}(R_{1}^{T})^{-1}b\\0\end{bmatrix}}}$ where one may either find ${\displaystyle R_{1}^{-1}}$ by Gaussian elimination or compute ${\displaystyle (R_{1}^{T})^{-1}b}$ directly by forward substitution. The latter technique is more accurate numerically and requires fewer computations.
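The underdetermined case can be sketched as follows. Only the first m columns of Q multiply nonzero entries, so the sketch uses a thin factorization of the transpose; `thin_qr` (modified Gram–Schmidt) and `solve_underdetermined` are illustrative helpers that assume A is real and has full row rank.

```python
import math

def thin_qr(A):
    """Thin QR by modified Gram-Schmidt; A (list of rows) needs independent columns."""
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        R[k][k] = math.sqrt(sum(x * x for x in cols[k]))
        cols[k] = [x / R[k][k] for x in cols[k]]
        for j in range(k + 1, n):
            R[k][j] = sum(a * b for a, b in zip(cols[k], cols[j]))
            cols[j] = [x - R[k][j] * y for x, y in zip(cols[j], cols[k])]
    Q = [[cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R

def solve_underdetermined(A, b):
    """Return a solution of A x = b for an m-by-n matrix A of rank m (m < n)."""
    m = len(A)
    At = [list(col) for col in zip(*A)]            # n-by-m transpose
    Q1, R1 = thin_qr(At)                           # A^T = Q1 R1
    # Forward substitution for R1^T y = b (R1^T is lower triangular).
    y = [0.0] * m
    for i in range(m):
        y[i] = (b[i] - sum(R1[j][i] * y[j] for j in range(i))) / R1[i][i]
    return [sum(Q1[i][j] * y[j] for j in range(m)) for i in range(len(At))]

A = [[1, 1, 0], [0, 1, 1]]
b = [2, 3]
x = solve_underdetermined(A, b)
```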

To find a solution, ${\displaystyle {\hat {x}}}$, to the overdetermined (${\displaystyle m\geq n}$) problem ${\displaystyle Ax=b}$ which minimizes the norm ${\displaystyle \|A{\hat {x}}-b\|}$, first find the QR factorization of A: ${\displaystyle A=QR}$. The solution can then be expressed as ${\displaystyle {\hat {x}}=R_{1}^{-1}(Q_{1}^{T}b)}$, where ${\displaystyle Q_{1}}$ is an ${\displaystyle m\times n}$ matrix containing the first ${\displaystyle n}$ columns of the full orthonormal basis ${\displaystyle Q}$ and where ${\displaystyle R_{1}}$ is the square ${\displaystyle n\times n}$ upper triangular block of ${\displaystyle R}$. As in the underdetermined case, back substitution can be used to quickly and accurately find this ${\displaystyle {\hat {x}}}$ without explicitly inverting ${\displaystyle R_{1}}$. (${\displaystyle Q_{1}}$ and ${\displaystyle R_{1}}$ are often provided by numerical libraries as an "economic" QR decomposition.)
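A least-squares solve along these lines might look like the sketch below, which fits a straight line to three points. It assumes A is real with linearly independent columns; `thin_qr` (modified Gram–Schmidt) and `least_squares` are illustrative names, and a library routine would normally be preferred in practice.

```python
import math

def thin_qr(A):
    """Thin QR by modified Gram-Schmidt; A (list of rows) needs independent columns."""
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        R[k][k] = math.sqrt(sum(x * x for x in cols[k]))
        cols[k] = [x / R[k][k] for x in cols[k]]
        for j in range(k + 1, n):
            R[k][j] = sum(a * b for a, b in zip(cols[k], cols[j]))
            cols[j] = [x - R[k][j] * y for x, y in zip(cols[j], cols[k])]
    Q = [[cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R

def least_squares(A, b):
    """Minimize ||A x - b|| for an m-by-n matrix A of rank n (m >= n)."""
    Q1, R1 = thin_qr(A)
    n = len(R1)
    c = [sum(Q1[i][j] * b[i] for i in range(len(A))) for j in range(n)]  # Q1^T b
    # Back substitution for R1 x = Q1^T b, without inverting R1.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (c[i] - sum(R1[i][j] * x[j] for j in range(i + 1, n))) / R1[i][i]
    return x

# Fit y = x0 + x1 * t to the points (0, 0), (1, 1), (2, 1).
A = [[1, 0], [1, 1], [1, 2]]
b = [0, 1, 1]
x_hat = least_squares(A, b)
```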

References

1. L. N. Trefethen and D. Bau, Numerical Linear Algebra (SIAM, 1997).
2. Template:Cite doi