Grothendieck's Galois theory: Difference between revisions
en>Helpful Pixie Bot m ISBNs (Build KH) |
en>Reddyuday |
{{distinguish2|the [[Schur complement method]] in numerical analysis.}}
In [[linear algebra]] and the theory of [[matrix (mathematics)|matrices]],
the '''Schur complement''' of a matrix block (i.e., a submatrix within a
larger matrix) is defined as follows.
Suppose ''A'', ''B'', ''C'', ''D'' are respectively
''p''×''p'', ''p''×''q'', ''q''×''p''
and ''q''×''q'' matrices, and ''D'' is invertible.
Let
:<math>M=\left[\begin{matrix} A & B \\ C & D \end{matrix}\right]</math>
so that ''M'' is a (''p''+''q'')×(''p''+''q'') matrix.
Then the '''Schur complement''' of the block ''D'' of the
matrix ''M'' is the ''p''×''p'' matrix
:<math>A-BD^{-1}C.\,</math>
It is named after [[Issai Schur]], who used it to prove [[Schur's lemma]], although it had been used previously.<ref>{{cite book |title=The Schur Complement and Its Applications |first=Fuzhen |last=Zhang |year=2005 |publisher=Springer |isbn=0-387-24271-6 |doi=10.1007/b105056}}</ref> Emilie Haynsworth was the first to call it the ''Schur complement''.<ref>Haynsworth, E. V., "On the Schur Complement", ''Basel Mathematical Notes'', #BNB 20, 17 pages, June 1968.</ref>
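As a small numerical illustration of the definition (the matrix is an arbitrary choice made for this example, not taken from the cited sources), let ''p'' = 2, ''q'' = 1 and
:<math>M=\left[\begin{matrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 2 \end{matrix}\right], \quad A=\left[\begin{matrix} 4 & 2 \\ 2 & 3 \end{matrix}\right], \quad B=\left[\begin{matrix} 1 \\ 1 \end{matrix}\right], \quad C=\left[\begin{matrix} 1 & 1 \end{matrix}\right], \quad D=\left[\begin{matrix} 2 \end{matrix}\right].</math>
Since ''D''<sup>−1</sup> = 1/2, the Schur complement of ''D'' in ''M'' is
:<math>A-BD^{-1}C = \left[\begin{matrix} 4 & 2 \\ 2 & 3 \end{matrix}\right] - \frac{1}{2}\left[\begin{matrix} 1 & 1 \\ 1 & 1 \end{matrix}\right] = \left[\begin{matrix} 7/2 & 3/2 \\ 3/2 & 5/2 \end{matrix}\right].</math>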
==Background==
The Schur complement arises as the result of performing a block [[Gaussian elimination]] by multiplying the matrix ''M'' from the right with the "block lower triangular" matrix
:<math>L=\left[\begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right].</math>
Here ''I<sub>p</sub>'' denotes a ''p''×''p'' [[identity matrix]]. After multiplication with the matrix ''L'', the Schur complement appears in the upper-left ''p''×''p'' block. The product matrix is
:<math>
\begin{align}
ML &= \left[\begin{matrix} A & B \\ C & D \end{matrix}\right]\left[\begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right] = \left[\begin{matrix} A-BD^{-1}C & B \\ 0 & D \end{matrix}\right] \\
&= \left[\begin{matrix} I_p & BD^{-1} \\ 0 & I_q \end{matrix}\right] \left[\begin{matrix} A-BD^{-1}C & 0 \\ 0 & D \end{matrix}\right].
\end{align}
</math>
This is analogous to an [[LDU decomposition]]. That is, we have shown that
:<math>
\begin{align}
\left[\begin{matrix} A & B \\ C & D \end{matrix}\right] &= \left[\begin{matrix} I_p & BD^{-1} \\ 0 & I_q \end{matrix}\right] \left[\begin{matrix} A-BD^{-1}C & 0 \\ 0 & D \end{matrix}\right]
\left[ \begin{matrix} I_p & 0 \\ D^{-1}C & I_q \end{matrix}\right],
\end{align}
</math>
and the inverse of ''M'' may thus be expressed in terms of only ''D''<sup>−1</sup> and the inverse of the Schur complement (if it exists), as
:<math>
\begin{align}
& {} \quad \left[ \begin{matrix} A & B \\ C & D \end{matrix}\right]^{-1} =
\left[ \begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right]
\left[ \begin{matrix} (A-BD^{-1}C)^{-1} & 0 \\ 0 & D^{-1} \end{matrix}\right]
\left[ \begin{matrix} I_p & -BD^{-1} \\ 0 & I_q \end{matrix}\right] \\[12pt]
& = \left[ \begin{matrix} \left(A-B D^{-1} C \right)^{-1} & -\left(A-B D^{-1} C \right)^{-1} B D^{-1} \\ -D^{-1}C\left(A-B D^{-1} C \right)^{-1} & D^{-1}+ D^{-1} C \left(A-B D^{-1} C \right)^{-1} B D^{-1} \end{matrix} \right].
\end{align}
</math>
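The block formula for the inverse can be checked numerically. The following is a minimal Python sketch using exact rational arithmetic from the standard library; the helper names (<code>matmul</code>, <code>inverse</code>, <code>block_inverse</code>) and the test blocks are illustrative choices made here, not part of any library or of the cited sources.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def neg(X):
    return [[-a for a in row] for row in X]

def inverse(X):
    """Gauss-Jordan inverse in exact arithmetic (StopIteration if X is singular)."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # nonzero pivot row
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def block_inverse(A, B, C, D):
    """Inverse of [[A, B], [C, D]] from D^-1 and the inverse of S = A - B D^-1 C."""
    Dinv = inverse(D)
    Sinv = inverse(sub(A, matmul(matmul(B, Dinv), C)))
    top_right = neg(matmul(matmul(Sinv, B), Dinv))
    bottom_left = neg(matmul(matmul(Dinv, C), Sinv))
    bottom_right = add(Dinv, matmul(matmul(matmul(Dinv, C), Sinv), matmul(B, Dinv)))
    top = [r1 + r2 for r1, r2 in zip(Sinv, top_right)]
    bottom = [r1 + r2 for r1, r2 in zip(bottom_left, bottom_right)]
    return top + bottom

# Illustrative blocks with p = 2, q = 1 (arbitrary values, assumed for the example).
A = [[4, 2], [2, 3]]
B = [[1], [1]]
C = [[1, 0]]
D = [[2]]
M = [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]
M_inv = block_inverse(A, B, C, D)
identity = [[int(i == j) for j in range(3)] for i in range(3)]
print(matmul(M, M_inv) == identity)
```

Only a ''q''×''q'' inverse and a ''p''×''p'' inverse are computed inside <code>block_inverse</code>; the direct <code>inverse(M)</code> is useful only as a cross-check.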
Cf. the [[matrix inversion lemma]], which illustrates the relationships between the above and the equivalent derivation with the roles of ''A'' and ''D'' interchanged.
If ''M'' is a [[positive-definite matrix|positive-definite]] symmetric matrix, then so is the Schur complement of ''D'' in ''M''.
If ''p'' and ''q'' are both 1 (i.e. ''A'', ''B'', ''C'' and ''D'' are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:
:<math> M^{-1} = \frac{1}{AD-BC} \left[ \begin{matrix} D & -B \\ -C & A \end{matrix}\right] </math>
provided that [[determinant|''AD'' − ''BC'']] is non-zero.
Moreover, taking determinants in the block factorization of ''M'' shows that
:<math> \det(M) = \det(D) \det(A - BD^{-1} C)</math>
which generalizes the determinant formula for 2×2 matrices.
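As a worked check of the determinant formula (with an arbitrary matrix chosen for this example), take ''p'' = 2, ''q'' = 1 and
:<math>M=\left[\begin{matrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 0 & 2 \end{matrix}\right], \quad D=\left[\begin{matrix} 2 \end{matrix}\right], \quad A-BD^{-1}C = \left[\begin{matrix} 4 & 2 \\ 2 & 3 \end{matrix}\right] - \frac{1}{2}\left[\begin{matrix} 1 & 0 \\ 1 & 0 \end{matrix}\right] = \left[\begin{matrix} 7/2 & 2 \\ 3/2 & 3 \end{matrix}\right].</math>
Then
:<math>\det(D)\det(A-BD^{-1}C) = 2\left(\tfrac{21}{2} - 3\right) = 15,</math>
which agrees with the cofactor expansion along the first row of ''M'':
:<math>\det(M) = 4(3\cdot 2 - 1\cdot 0) - 2(2\cdot 2 - 1\cdot 1) + 1(2\cdot 0 - 3\cdot 1) = 24 - 6 - 3 = 15.</math>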
== Application to solving linear equations ==
The Schur complement arises naturally in solving a system of linear equations such as
:<math>Ax + By = a \, </math>
:<math>Cx + Dy = b \, </math>
where ''x'', ''a'' are ''p''-dimensional [[column vector]]s, ''y'', ''b'' are ''q''-dimensional column vectors, and ''A'', ''B'', ''C'', ''D'' are as above. Multiplying the bottom equation by <math>BD^{-1}</math> and then subtracting from the top equation, one obtains
:<math>(A - BD^{-1} C) x = a - BD^{-1} b.\,</math>
Thus if one can invert ''D'' as well as the Schur complement of ''D'', one can solve for ''x'', and
then by using the equation <math>Cx + Dy = b</math> one can solve for ''y''. This reduces the problem of
inverting a <math>(p+q) \times (p+q)</math> matrix to that of inverting a ''p''×''p'' matrix and a ''q''×''q'' matrix. In practice one needs ''D'' to be [[Condition number|well-conditioned]] in order for this algorithm to be numerically accurate.
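The elimination procedure described in this section can be sketched in Python as follows. This is a minimal illustration in exact rational arithmetic, not a numerically robust solver; the function name <code>solve_by_schur</code>, the helpers, and the sample data are assumptions made for the example.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def sub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inverse(X):
    """Gauss-Jordan inverse in exact arithmetic (StopIteration if X is singular)."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # nonzero pivot row
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def solve_by_schur(A, B, C, D, a, b):
    """Solve A x + B y = a, C x + D y = b by eliminating y first.

    a and b are column vectors stored as lists of one-element rows.
    """
    Dinv = inverse(D)
    BDinv = matmul(B, Dinv)
    S = sub(A, matmul(BDinv, C))            # Schur complement of D
    rhs = sub(a, matmul(BDinv, b))          # a - B D^-1 b
    x = matmul(inverse(S), rhs)             # solve S x = a - B D^-1 b
    y = matmul(Dinv, sub(b, matmul(C, x)))  # back-substitute into C x + D y = b
    return x, y

# Sample data (p = 2, q = 1), built so that the solution is x = (1, 2), y = (3).
A = [[4, 2], [2, 3]]
B = [[1], [1]]
C = [[1, 0]]
D = [[2]]
a = [[11], [11]]
b = [[7]]
x, y = solve_by_schur(A, B, C, D, a, b)
print(x, y)
```

The sketch forms explicit inverses only to mirror the text; in floating-point practice one would normally solve the two smaller systems by factorization instead.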
==Applications to probability theory and statistics==
Suppose the random column vectors ''X'', ''Y'' live in '''R'''<sup>''n''</sup> and '''R'''<sup>''m''</sup> respectively, and the vector (''X'', ''Y'') in '''R'''<sup>''n''+''m''</sup> has a [[multivariate normal distribution]] whose variance is the symmetric positive-definite matrix
:<math>V=\left[\begin{matrix} A & B \\ B^T & C \end{matrix}\right],</math>
where ''A'' is ''n''-by-''n'' and ''C'' is ''m''-by-''m''.
Then the [[conditional variance]] of ''X'' given ''Y'' is the Schur complement of ''C'' in ''V'':
:<math>\operatorname{var}(X\mid Y) = A-BC^{-1}B^T.</math>
If we take the matrix ''V'' above to be, not a variance of a random vector, but a ''sample'' variance, then it may have a [[Wishart distribution]]. In that case, the Schur complement of ''C'' in ''V'' also has a Wishart distribution.{{Citation needed|date=January 2014}}
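As a simple illustration of the conditional variance formula (the numbers are chosen here for the example), let ''n'' = ''m'' = 1 with ''A'' = 2, ''B'' = 1 and ''C'' = 1, so that
:<math>V=\left[\begin{matrix} 2 & 1 \\ 1 & 1 \end{matrix}\right],</math>
which is positive definite because 2 > 0 and det(''V'') = 1 > 0. Then
:<math>\operatorname{var}(X\mid Y) = A-BC^{-1}B^T = 2 - 1\cdot 1^{-1}\cdot 1 = 1,</math>
so conditioning on ''Y'' reduces the variance of ''X'' from 2 to 1.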
== Schur complement condition for positive definiteness ==
Let ''X'' be a symmetric matrix given by
:<math>X=\left[\begin{matrix} A & B \\ B^T & C \end{matrix}\right].</math>
Let ''S'' be the Schur complement of ''A'' in ''X'', that is:
:<math>S= C - B^T A^{-1} B . \, </math>
Then
* <math>X</math> is positive definite if and only if <math>A</math> and <math>S</math> are both positive definite:
:<math>X \succ 0 \Leftrightarrow A \succ 0, S = C - B^T A^{-1} B \succ 0</math>.
* <math>X</math> is positive definite if and only if <math>C</math> and <math>A - B C^{-1} B^T</math> are both positive definite:
:<math>X \succ 0 \Leftrightarrow C \succ 0, A - B C^{-1} B^T \succ 0</math>.
* If <math>A</math> is positive definite, then <math>X</math> is positive semidefinite if and only if <math>S</math> is positive semidefinite:
:<math>\text{If } A \succ 0, \text{ then } X \succeq 0 \Leftrightarrow S = C - B^T A^{-1} B \succeq 0</math>.
* If <math>C</math> is positive definite, then <math>X</math> is positive semidefinite if and only if <math>A - B C^{-1} B^T</math> is positive semidefinite:
:<math>\text{If } C \succ 0, \text{ then } X \succeq 0 \Leftrightarrow A - B C^{-1} B^T \succeq 0</math>.
The first and third statements can be derived<ref>Boyd, S. and Vandenberghe, L. (2004), "Convex Optimization", Cambridge University Press (Appendix A.5.5)</ref> by considering the minimizer of the quantity
:<math> u^T A u + 2 v^T B^T u + v^T C v, \,</math>
as a function of ''u'' (for fixed ''v''); when ''A'' is positive definite, the minimum value is <math>v^T S v</math>.
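The first criterion in the list above can be sketched in Python and compared with a direct test on ''X''. This is an illustrative sketch in exact rational arithmetic; the function names and sample blocks are assumptions made for the example, and the direct test relies on the standard fact that a symmetric matrix is positive definite exactly when all pivots of Gaussian elimination without row exchanges are positive.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def sub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(X):
    """Gauss-Jordan inverse in exact arithmetic (StopIteration if X is singular)."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # nonzero pivot row
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def is_positive_definite(X):
    """Direct test for a symmetric X: every elimination pivot must be positive."""
    n = len(X)
    W = [[Fraction(v) for v in row] for row in X]
    for k in range(n):
        if W[k][k] <= 0:
            return False
        for i in range(k + 1, n):
            f = W[i][k] / W[k][k]
            W[i] = [a - f * b for a, b in zip(W[i], W[k])]
    return True

def schur_criterion(A, B, C):
    """X = [[A, B], [B^T, C]] is PD iff A and S = C - B^T A^-1 B are both PD."""
    if not is_positive_definite(A):
        return False
    S = sub(C, matmul(matmul(transpose(B), inverse(A)), B))
    return is_positive_definite(S)

def assemble(A, B, C):
    """Build the full symmetric matrix X from its blocks."""
    top = [ra + rb for ra, rb in zip(A, B)]
    bottom = [rbt + rc for rbt, rc in zip(transpose(B), C)]
    return top + bottom

# Sample blocks (assumed for the example): S = 4/3 for C_good, S = -1/6 for C_bad.
A = [[2, 1], [1, 2]]
B = [[0], [1]]
C_good = [[2]]
C_bad = [[Fraction(1, 2)]]
for C in (C_good, C_bad):
    print(schur_criterion(A, B, C), is_positive_definite(assemble(A, B, C)))
```

In each case the two tests agree, while the Schur criterion only ever works with the smaller blocks ''A'' and ''S''.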
== See also ==
* [[Woodbury matrix identity]]
* [[Quasi-Newton method]]
* [[Haynsworth inertia additivity formula]]
== References ==
{{reflist}}
[[Category:Linear algebra]]
Revision as of 00:43, 24 August 2013