In mathematics, the '''generalized minimal residual method''' (usually abbreviated '''GMRES''') is an [[iterative method]] for the [[numerical analysis|numerical]] solution of a nonsymmetric [[system of linear equations]]. The method approximates the solution by the vector in a [[Krylov subspace]] with minimal [[residual (numerical analysis)|residual]]. The [[Arnoldi iteration]] is used to find this vector.
 
The GMRES method was developed by [[Yousef Saad]] and Martin H. Schultz in 1986.<ref>Saad and Schultz</ref>
GMRES is a generalization of the [[MINRES]] method developed by Chris Paige and Michael Saunders in 1975. GMRES also is a special case of the [[DIIS]] method developed by Peter Pulay in 1980. DIIS is also applicable to non-linear systems.
 
== The method ==
 
Denote the Euclidean norm of any vector ''v'' by <math>\|v\|</math>. Denote the system of linear equations to be solved by
:<math> Ax = b. \, </math>
The matrix ''A'' is assumed to be [[invertible matrix|invertible]] of size ''m''-by-''m''. Furthermore, it is assumed that ''b'' is normalized, i.e., that ||''b''|| = 1.
 
The ''n''th [[Krylov_sequence|Krylov subspace]] for this problem is
:<math> K_n = K_n(A,b) = \operatorname{span} \, \{ b, Ab, A^2b, \ldots, A^{n-1}b \}. \, </math>
GMRES approximates the exact solution of ''Ax'' = ''b'' by the vector ''x''<sub>''n''</sub> ∈ ''K''<sub>''n''</sub> that minimizes the Euclidean norm of the [[Residual_(numerical_analysis)| residual]] ''Ax''<sub>''n''</sub> &minus; ''b''.
 
The vectors ''b'', ''Ab'', …, ''A''<sup>''n''&minus;1</sup>''b'' might be almost [[linear independence| linearly dependent]], so instead of this basis, the [[Arnoldi iteration]] is used to find orthonormal vectors
:<math> q_1, q_2, \ldots, q_n \, </math>
which form a basis for ''K''<sub>''n''</sub>. Hence, the vector ''x''<sub>''n''</sub> ∈ ''K''<sub>''n''</sub> can be written as ''x''<sub>''n''</sub> = ''Q''<sub>''n''</sub>''y''<sub>''n''</sub> with ''y''<sub>''n''</sub> ∈ '''R'''<sup>''n''</sup>, where ''Q''<sub>''n''</sub> is the ''m''-by-''n'' matrix formed by ''q''<sub>1</sub>, …, ''q''<sub>n</sub>.
 
The Arnoldi process also produces an (''n''+1)-by-''n'' upper [[Hessenberg matrix]] <math>\tilde{H}_n</math> with
:<math> AQ_n = Q_{n+1} \tilde{H}_n. \, </math>
Because the columns of <math>Q_{n+1}</math> are orthonormal, we have
:<math> \| Ax_n - b \| = \| \tilde{H}_ny_n - \beta e_1 \|, \, </math>
where
:<math> e_1 = (1,0,0,\ldots,0) \, </math>
is the first vector in the [[standard basis]] of '''R'''<sup>''n''+1</sup>, and
:<math> \beta = \|b-Ax_0\| \, ,</math>
<math>x_0</math> being the first trial vector (usually zero). Hence, <math>x_n</math> can be found by minimizing the Euclidean norm of the residual
:<math> r_n = \tilde{H}_n y_n - \beta e_1. </math>
This is a [[linear least squares (mathematics)|linear least squares]] problem of size ''n''.
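The relations above can be checked numerically. The following is an illustrative NumPy sketch (not part of the original derivation; the <code>arnoldi</code> helper and the variable names are ad hoc) that builds the orthonormal basis and the Hessenberg matrix with the Arnoldi iteration and verifies both <math>AQ_n = Q_{n+1}\tilde{H}_n</math> and the residual identity, taking ''x''<sub>0</sub> = 0.
<syntaxhighlight lang="python">
import numpy as np

def arnoldi(A, b, n):
    """Run n steps of the Arnoldi iteration started from b.
    Returns Q (m-by-(n+1), orthonormal columns) and the (n+1)-by-n
    upper Hessenberg matrix H."""
    m = A.shape[0]
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(n):
        v = A @ Q[:, j]
        for i in range(j + 1):              # orthogonalize against the previous q_i
            H[i, j] = Q[:, i] @ v
            v -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        Q[:, j + 1] = v / H[j + 1, j]       # assumes no breakdown (H[j+1, j] != 0)
    return Q, H

rng = np.random.default_rng(0)
m, n = 50, 10
A = rng.standard_normal((m, m))
b = rng.standard_normal(m)

Q, H = arnoldi(A, b, n)
beta = np.linalg.norm(b)                    # x_0 = 0, so beta = ||b||
print(np.allclose(A @ Q[:, :n], Q @ H))     # Arnoldi relation A Q_n = Q_{n+1} H_n

# the two norms in the residual identity agree for any y:
y = rng.standard_normal(n)
e1 = np.zeros(n + 1); e1[0] = 1.0
x = Q[:, :n] @ y
print(np.isclose(np.linalg.norm(A @ x - b), np.linalg.norm(H @ y - beta * e1)))
</syntaxhighlight>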
 
This yields the GMRES method. At every step of the iteration:
# do one step of the Arnoldi method;
# find the <math> y_n </math> which minimizes ||''r''<sub>''n''</sub>||;
# compute <math> x_n = Q_n y_n </math>;
# repeat if the residual is not yet small enough.
At every iteration, a matrix-vector product ''Aq''<sub>''n''</sub> must be computed. This costs about 2''m''<sup>2</sup> [[floating point|floating-point operations]] for general dense matrices of size ''m'', but the cost can decrease to O(''m'') for [[sparse matrix|sparse matrices]]. In addition to the matrix-vector product, O(''n''&nbsp;''m'') floating-point operations must be performed at the ''n''th iteration.
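The four steps can be assembled into a short, self-contained sketch (illustrative only, not a library routine; <code>numpy.linalg.lstsq</code> stands in for the incremental QR update described in the section on the least squares problem below, and ''x''<sub>0</sub> = 0 is assumed).
<syntaxhighlight lang="python">
import numpy as np

def gmres_sketch(A, b, tol=1e-8, max_iter=None):
    m = A.shape[0]
    max_iter = max_iter or m
    beta = np.linalg.norm(b)
    Q = np.zeros((m, max_iter + 1))
    H = np.zeros((max_iter + 1, max_iter))
    Q[:, 0] = b / beta
    for n in range(1, max_iter + 1):
        # 1. one step of the Arnoldi method
        v = A @ Q[:, n - 1]
        for i in range(n):
            H[i, n - 1] = Q[:, i] @ v
            v -= H[i, n - 1] * Q[:, i]
        H[n, n - 1] = np.linalg.norm(v)
        if H[n, n - 1] > 0:                       # breakdown means the exact solution is in K_n
            Q[:, n] = v / H[n, n - 1]
        # 2. find y_n that minimizes ||H_n y - beta e_1||
        rhs = np.zeros(n + 1); rhs[0] = beta
        y, *_ = np.linalg.lstsq(H[:n + 1, :n], rhs, rcond=None)
        # 3. compute x_n = Q_n y_n
        x = Q[:, :n] @ y
        # 4. stop once the residual is small enough
        if np.linalg.norm(A @ x - b) < tol:
            break
    return x

rng = np.random.default_rng(1)
A = np.eye(100) + 0.1 * rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x = gmres_sketch(A, b)
print(np.linalg.norm(A @ x - b))   # should be below the tolerance
</syntaxhighlight>
In an actual implementation the residual norm is available essentially for free as a by-product of the QR update (the quantity |&gamma;<sub>''n''</sub>| below), so ''Ax''<sub>''n''</sub> need not be formed at every step as in this sketch.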
 
== Convergence ==
 
The ''n''th iterate minimizes the residual in the Krylov subspace ''K''<sub>''n''</sub>. Since every subspace is contained in the next subspace, the residual decreases monotonically. After ''m'' iterations, where ''m'' is the size of the matrix ''A'', the Krylov space ''K''<sub>''m''</sub> is the whole of '''R'''<sup>''m''</sup> and hence the GMRES method arrives at the exact solution. However, the idea is that after a small number of iterations (relative to ''m''), the vector ''x''<sub>''n''</sub> is already a good approximation to the exact solution.
 
This does not happen in general. Indeed, a theorem of Greenbaum, Pták and Strakoš states that for every monotonically decreasing sequence ''a''<sub>1</sub>, …, ''a''<sub>''m''&minus;1</sub>, ''a''<sub>''m''</sub> = 0, one can find a matrix ''A'' such that ||''r''<sub>''n''</sub>|| = ''a''<sub>''n''</sub> for all ''n'', where ''r''<sub>''n''</sub> is the residual defined above. In particular, it is possible to find a matrix for which the residual stays constant for ''m''&nbsp;&minus;&nbsp;1 iterations, and only drops to zero at the last iteration.
 
In practice, though, GMRES often performs well. This can be proven in specific situations. If ''A'' is [[positive-definite matrix|positive definite]], then
:<math> \|r_n\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{n/2} \|r_0\|, </math>
where <math>\lambda_{\mathrm{min}}(M)</math> and <math>\lambda_{\mathrm{max}}(M)</math> denote the smallest and largest [[eigenvalue]] of the matrix <math>M</math>, respectively.
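As an illustrative sanity check of this bound (not taken from the cited literature), the GMRES residual at step ''n'' can be computed directly as <math>\min_y \|AK_ny - b\|</math> over the Krylov matrix ''K''<sub>''n''</sub> = [''b'', ''Ab'', …, ''A''<sup>''n''&minus;1</sup>''b''] and compared with the right-hand side for a random positive-definite test matrix.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
m = 60
A = 2 * np.eye(m) + 0.1 * rng.standard_normal((m, m))  # nonsymmetric; symmetric part positive definite (checked below)
b = rng.standard_normal(m)
b /= np.linalg.norm(b)                                  # so that ||r_0|| = ||b|| = 1

lam_min = np.linalg.eigvalsh(0.5 * (A + A.T)).min()     # lambda_min(1/2 (A^T + A))
lam_max = np.linalg.eigvalsh(A.T @ A).max()             # lambda_max(A^T A)
assert lam_min > 0                                      # the bound requires a positive-definite A
factor = 1 - lam_min**2 / lam_max

K = b[:, None]                                          # Krylov basis, one column per iteration
for n in range(1, 16):
    y, *_ = np.linalg.lstsq(A @ K, b, rcond=None)       # GMRES residual at step n: min_y ||A K_n y - b||
    r_n = np.linalg.norm(A @ (K @ y) - b)
    print(n, r_n <= factor ** (n / 2) + 1e-12)          # the bound should hold at every step
    v = A @ K[:, -1]
    K = np.hstack([K, (v / np.linalg.norm(v))[:, None]])  # extend the basis toward A^n b
</syntaxhighlight>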
 
If ''A'' is [[symmetric matrix|symmetric]] and positive definite, then we even have
:<math> \|r_n\| \leq \left( \frac{\kappa_2(A)^2-1}{\kappa_2(A)^2} \right)^{n/2} \|r_0\|, </math>
where <math>\kappa_2(A)</math> denotes the [[condition number]] of ''A'' in the Euclidean norm; this follows from the previous bound, since for a symmetric positive definite matrix the two eigenvalue quantities reduce to <math>\lambda_{\mathrm{min}}(A)^2</math> and <math>\lambda_{\mathrm{max}}(A)^2</math>.
 
In the general case, where ''A'' is not positive definite, we have
:<math> \|r_n\| \le \inf_{p \in P_n} \|p(A)\| \, \|r_0\| \le \kappa_2(V) \inf_{p \in P_n} \max_{\lambda \in \sigma(A)} |p(\lambda)| \, \|r_0\|, \, </math>
where ''P''<sub>''n''</sub> denotes the set of polynomials of degree at most ''n'' with ''p''(0) = 1, ''V'' is the matrix appearing in the [[spectral decomposition]] ''A'' = ''V''&Lambda;''V''<sup>&minus;1</sup> (so ''A'' is assumed diagonalizable here), and σ(''A'') is the [[spectrum of a matrix|spectrum]] of ''A''. Roughly speaking, this says that fast convergence occurs when the eigenvalues of ''A'' are clustered away from the origin and ''A'' is not too far from [[normal matrix|normality]].<ref>Trefethen & Bau, Thm 35.2</ref>
 
All these inequalities bound only the residuals instead of the actual error, that is, the distance between the current iterate ''x''<sub>''n''</sub> and the exact solution.
 
== Extensions of the method ==
 
Like other iterative methods, GMRES is usually combined with a [[preconditioning]] method in order to speed up convergence.
 
The cost of the iterations grows as O(''n''<sup>2</sup>), where ''n'' is the iteration number. Therefore, the method is sometimes restarted after a number, say ''k'', of iterations, with ''x''<sub>''k''</sub> as initial guess. The resulting method is called GMRES(''k'') or Restarted GMRES.
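A minimal sketch of GMRES(''k''), reusing the illustrative <code>gmres_sketch</code> helper from above (not a library routine): run at most ''k'' inner iterations, take the result as the new initial guess, and repeat on the updated residual. Restarting caps the work and storage per cycle at ''k''+1 basis vectors. SciPy's <code>scipy.sparse.linalg.gmres</code> exposes the same idea through its <code>restart</code> parameter.
<syntaxhighlight lang="python">
import numpy as np

def restarted_gmres(A, b, k=20, tol=1e-8, max_restarts=50):
    """GMRES(k): restart the inner iteration every k steps, keeping at most
    k+1 Krylov basis vectors in memory at any time."""
    x = np.zeros_like(b)
    for _ in range(max_restarts):
        r = b - A @ x                                   # residual of the current iterate
        if np.linalg.norm(r) < tol:
            break
        # at most k inner GMRES iterations on A * dx = r, then correct x
        x = x + gmres_sketch(A, r, tol=tol, max_iter=k)
    return x
</syntaxhighlight>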
 
== Comparison with other solvers ==
 
The Arnoldi iteration reduces to the [[Lanczos iteration]] for symmetric matrices. The corresponding Krylov subspace method is the minimal residual method (MinRes) of Paige and Saunders. Unlike in the unsymmetric case, the MinRes method is given by a three-term recurrence relation. It can be shown that there is no Krylov subspace method for general matrices that is given by a short recurrence relation and yet minimizes the norms of the residuals, as GMRES does.
 
Another class of methods builds on the [[unsymmetric Lanczos iteration]], in particular the [[Biconjugate gradient method|BiCG method]]. These use a three-term recurrence relation, but they do not attain the minimum residual, and hence the residual does not decrease monotonically for these methods. Convergence is not even guaranteed.
 
The third class is formed by methods like [[Conjugate gradient squared method|CGS]] and [[Biconjugate gradient stabilized method|BiCGSTAB]]. These also work with a three-term recurrence relation (hence, without optimality) and they can even terminate prematurely without achieving convergence. The idea behind these methods is to choose the generating polynomials of the iteration sequence suitably.
 
None of these three classes is the best for all matrices; there are always examples in which one class outperforms the other. Therefore, multiple solvers are tried in practice to see which one is the best for a given problem.
 
== Solving the least squares problem ==
 
One part of the GMRES method is to find the vector <math>y_n</math> which minimizes
:<math> \| \tilde{H}_n y_n - \beta e_1 \|. \, </math>
Note that <math>\tilde{H}_n</math> is an (''n''+1)-by-''n'' matrix, hence it gives an overdetermined linear system of ''n''+1 equations for ''n'' unknowns.
 
The minimum can be computed using a [[QR decomposition]]: find an (''n''+1)-by-(''n''+1) [[orthogonal matrix]] &Omega;<sub>''n''</sub> and an (''n''+1)-by-''n'' upper [[triangular matrix]] <math>\tilde{R}_n</math> such that
:<math> \Omega_n \tilde{H}_n = \tilde{R}_n. </math>
The triangular matrix has one more row than it has columns, so its bottom row consists entirely of zeros. Hence, it can be decomposed as
:<math> \tilde{R}_n = \begin{bmatrix} R_n \\ 0 \end{bmatrix}, </math>
where <math>R_n</math> is an ''n''-by-''n'' (thus square) triangular matrix.
 
The QR decomposition can be updated cheaply from one iteration to the next, because the Hessenberg matrices differ only by a row of zeros and a column:
:<math>\tilde{H}_{n+1} = \begin{bmatrix} \tilde{H}_n & h_{n+1} \\ 0 & h_{n+2,n+1} \end{bmatrix}, </math>
where ''h''<sub>''n+1''</sub> = (''h''<sub>1,''n+1''</sub>, &hellip;, ''h''<sub>''n+1,n+1''</sub>)<sup>T</sup>. This implies that premultiplying the Hessenberg matrix with &Omega;<sub>''n''</sub>, augmented with an extra row and column whose only nonzero entry is a one on the diagonal, yields almost a triangular matrix:
:<math> \begin{bmatrix} \Omega_n & 0 \\ 0 & 1 \end{bmatrix} \tilde{H}_{n+1} = \begin{bmatrix} R_n & r_{n+1} \\ 0 & \rho \\ 0 & \sigma \end{bmatrix} </math>
This would be triangular if &sigma; were zero. To remedy this, one needs the [[Givens rotation]]
:<math> G_n = \begin{bmatrix} I_{n} & 0 & 0 \\ 0 & c_n & s_n \\ 0 & -s_n & c_n \end{bmatrix} </math>
where
:<math> c_n = \frac{\rho}{\sqrt{\rho^2+\sigma^2}} \quad\mbox{and}\quad s_n = \frac{\sigma}{\sqrt{\rho^2+\sigma^2}}. </math>
With this Givens rotation, we form
:<math> \Omega_{n+1} = G_n \begin{bmatrix} \Omega_n & 0 \\ 0 & 1 \end{bmatrix}. </math>
Indeed,
:<math> \Omega_{n+1} \tilde{H}_{n+1} = \begin{bmatrix} R_n & r_{n+1} \\ 0 & r_{n+1,n+1} \\ 0 & 0 \end{bmatrix} \quad\text{with}\quad r_{n+1,n+1} = \sqrt{\rho^2+\sigma^2} </math>
is a triangular matrix.
 
Given the QR decomposition, the minimization problem is easily solved by noting that
:<math> \| \tilde{H}_n y_n - \beta e_1 \| = \| \Omega_n (\tilde{H}_n y_n - \beta e_1) \| = \| \tilde{R}_n y_n - \beta \Omega_n e_1 \|. </math>
Denoting the vector <math>\beta\Omega_ne_1</math> by
:<math> \tilde{g}_n = \begin{bmatrix} g_n \\ \gamma_n \end{bmatrix} </math>
with ''g''<sub>''n''</sub> &isin; '''R'''<sup>''n''</sup> and &gamma;<sub>''n''</sub> &isin; '''R''', this is
:<math> \| \tilde{H}_n y_n - \beta e_1 \| = \| \tilde{R}_n y_n - \beta \Omega_n e_1 \| = \left\| \begin{bmatrix} R_n \\ 0 \end{bmatrix} y_n - \begin{bmatrix} g_n \\ \gamma_n \end{bmatrix} \right\|. </math>
The vector ''y''<sub>''n''</sub> that minimizes this expression is given by
:<math> y_n = R_n^{-1} g_n. </math>
Again, the vectors <math>g_n</math> are easy to update.<ref>Stoer and Bulirsch, §8.7.2</ref>
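The procedure can be summarized in a short, illustrative NumPy sketch (names are ad hoc; for simplicity it triangularizes the whole Hessenberg matrix in one pass per call, whereas an actual implementation carries the rotations over from one iteration to the next as described above).
<syntaxhighlight lang="python">
import numpy as np

def solve_hessenberg_lsq(H, beta):
    """Minimize ||H y - beta e_1|| for an (n+1)-by-n upper Hessenberg H
    using one Givens rotation per column; |g[n]| is the residual norm."""
    n = H.shape[1]
    R = H.astype(float).copy()
    g = np.zeros(n + 1)
    g[0] = beta
    for j in range(n):
        rho, sigma = R[j, j], R[j + 1, j]
        denom = np.hypot(rho, sigma)
        c, s = rho / denom, sigma / denom
        G = np.array([[c, s], [-s, c]])
        R[j:j + 2, j:] = G @ R[j:j + 2, j:]     # zero out the subdiagonal entry
        g[j:j + 2] = G @ g[j:j + 2]             # apply the same rotation to beta*e_1
    y = np.linalg.solve(np.triu(R[:n, :n]), g[:n])  # y_n = R_n^{-1} g_n
    return y, abs(g[n])

# quick check against a direct least-squares solve
rng = np.random.default_rng(3)
n = 6
H = np.triu(rng.standard_normal((n + 1, n)), k=-1)   # random upper Hessenberg matrix
beta = 2.0
y, res = solve_hessenberg_lsq(H, beta)
rhs = np.zeros(n + 1); rhs[0] = beta
y_ref, *_ = np.linalg.lstsq(H, rhs, rcond=None)
print(np.allclose(y, y_ref), np.isclose(res, np.linalg.norm(H @ y - rhs)))
</syntaxhighlight>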
 
== See also==
* [[Biconjugate gradient method]]
 
== Notes ==
<references/>
 
== References ==
 
* A. Meister, ''Numerik linearer Gleichungssysteme'', 2nd edition, Vieweg 2005, ISBN 978-3-528-13135-7.
* Y. Saad, ''Iterative Methods for Sparse Linear Systems'', 2nd edition, [[Society for Industrial and Applied Mathematics]], 2003. ISBN 978-0-89871-534-7.
* Y. Saad and M.H. Schultz, "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems", ''SIAM J. Sci. Stat. Comput.'', '''7''':856-869, 1986. {{doi|10.1137/0907058}}.
* J. Stoer and R. Bulirsch, ''Introduction to numerical analysis'', 3rd edition, Springer, New York, 2002. ISBN 978-0-387-95452-3.
* Lloyd N. Trefethen and David Bau, III, ''Numerical Linear Algebra'', Society for Industrial and Applied Mathematics, 1997. ISBN 978-0-89871-361-9.
* [http://www.netlib.org/linalg/html_templates/node29.html#SECTION00734000000000000000 J. Dongarra ''et al.'', ''Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods''], 2nd edition, SIAM, Philadelphia, 1994.
 
{{Numerical linear algebra}}
 
[[Category:Numerical linear algebra]]
[[Category:Articles with example pseudocode]]
