In [[mathematics]], the '''power iteration''' is an [[eigenvalue algorithm]]: given a [[matrix (mathematics)|matrix]] ''A'', the algorithm produces a number λ (the [[eigenvalue]]) and a nonzero vector ''v'' (the eigenvector) such that ''Av'' = λ''v''.
The algorithm is also known as the von Mises iteration.<ref name=VonMises>R. von Mises and H. Pollaczek-Geiringer, ''Praktische Verfahren der Gleichungsauflösung'', ZAMM – Zeitschrift für Angewandte Mathematik und Mechanik 9, 152–164 (1929).</ref>

The power iteration is a very simple algorithm. It does not compute a [[matrix decomposition]], and hence it can be used when ''A'' is a very large [[sparse matrix]]. However, it finds only one eigenvalue (the one of greatest [[absolute value]]) and it may converge only slowly.
==The method==
The power iteration algorithm starts with a vector ''b''<sub>0</sub>, which may be an approximation to the dominant eigenvector or a random vector. The method is described by the iteration
:<math> b_{k+1} = \frac{Ab_k}{\|Ab_k\|}. </math>
So, at every iteration, the vector ''b''<sub>''k''</sub> is multiplied by the matrix ''A'' and normalized.
Under the two assumptions:
*''A'' has an eigenvalue that is strictly greater in magnitude than its other eigenvalues, and
*the starting vector <math>b_{0}</math> has a nonzero component in the direction of an eigenvector associated with that dominant eigenvalue,
a subsequence of <math>\left( b_{k} \right)</math> converges to an eigenvector associated with the dominant eigenvalue.

Note that the sequence <math>\left( b_{k} \right)</math> itself does not necessarily converge. It can be shown that
:<math>b_{k} = e^{i \phi_{k}} v_{1} + r_{k},</math>
where <math>v_{1}</math> is an eigenvector associated with the dominant eigenvalue and <math> \| r_{k} \| \rightarrow 0</math>. The presence of the phase term <math>e^{i \phi_{k}}</math> implies that <math>\left( b_{k} \right) </math> does not converge unless <math>e^{i \phi_{k}} = 1</math>, which is the case when the dominant eigenvalue is a positive real number.
Under the two assumptions listed above, the sequence <math>\left( \mu_{k} \right)</math> defined by
:<math>\mu_{k} = \frac{b_{k}^{*}Ab_{k}}{b_{k}^{*}b_{k}}</math>
converges to the dominant eigenvalue.
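This behaviour can be illustrated numerically. The sketch below is an example, not part of the original text: the 2×2 matrix and helper names are chosen for the demonstration. With ''A'' = diag(−2, 1), the dominant eigenvalue is −2, so the phase factor makes the iterates <math>b_k</math> flip sign at every step and never converge, while the Rayleigh-quotient sequence <math>\mu_k</math> still converges to −2.

```cpp
#include <cmath>

// Example matrix with dominant eigenvalue -2: the phase factor
// (lambda_1/|lambda_1|)^k = (-1)^k makes b_k alternate in sign.
static const double A[2][2] = {{-2.0, 0.0}, {0.0, 1.0}};

static void mat_vec(const double x[2], double y[2]) {
    for (int i = 0; i < 2; ++i)
        y[i] = A[i][0] * x[0] + A[i][1] * x[1];
}

// Run `steps` normalized power-iteration steps from b_0 = (1, 1)
// and store the final unit vector in b.
static void iterate(int steps, double b[2]) {
    b[0] = 1.0; b[1] = 1.0;
    for (int k = 0; k < steps; ++k) {
        double t[2];
        mat_vec(b, t);
        double n = std::sqrt(t[0] * t[0] + t[1] * t[1]);
        b[0] = t[0] / n;
        b[1] = t[1] / n;
    }
}

// First component of b_k: its sign alternates with the parity of k.
double first_component(int steps) {
    double b[2];
    iterate(steps, b);
    return b[0];
}

// Rayleigh quotient mu_k = b^T A b / b^T b (b already has unit length).
double rayleigh(int steps) {
    double b[2], Ab[2];
    iterate(steps, b);
    mat_vec(b, Ab);
    return b[0] * Ab[0] + b[1] * Ab[1];
}
```

For instance, `first_component(20)` is close to +1 while `first_component(21)` is close to −1, yet `rayleigh(k)` approaches −2 as ''k'' grows.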
The iteration can be implemented with the following simple algorithm, repeated until ''b'' converges:
<source lang="cpp">
for (int iter = 0; iter < max_iterations; iter++) {
    // calculate the matrix-by-vector product tmp = A*b
    for (int i = 0; i < n; i++) {
        tmp[i] = 0;
        for (int j = 0; j < n; j++)
            tmp[i] += A[i][j] * b[j]; // dot product of the ith row of A with b
    }
    // calculate the length of the resulting vector
    norm_sq = 0;
    for (int k = 0; k < n; k++)
        norm_sq += tmp[k] * tmp[k];
    norm = sqrt(norm_sq);
    // normalize tmp to a unit vector, which becomes b for the next iteration
    for (int k = 0; k < n; k++)
        b[k] = tmp[k] / norm;
}
</source>
The value of ''norm'' converges to the absolute value of the dominant eigenvalue, and the vector ''b'' to an associated eigenvector.

''Note:'' The above code assumes real ''A'' and ''b''. To handle complex entries, compute the squared norm as <code>conj(tmp[k])*tmp[k]</code> instead of <code>tmp[k]*tmp[k]</code>, so that ''norm'' remains real.

This algorithm is the one used to calculate such things as the Google [[PageRank]].
The method can also be used to calculate the [[spectral radius]] of a matrix: the absolute value of the [[Rayleigh quotient]]
:<math> \frac{b_k^\top A b_k}{b_k^\top b_k} </math>
converges to the spectral radius <math>|\lambda_1|</math>.
==Analysis==
Let <math>A</math> be decomposed into its [[Jordan canonical form]]: <math>A=VJV^{-1}</math>, where the first column of <math>V</math> is an eigenvector of <math>A</math> corresponding to the dominant eigenvalue <math>\lambda_{1}</math>. Since the dominant eigenvalue of <math>A</math> is unique, the first Jordan block of <math>J</math> is the <math>1 \times 1</math> matrix <math>\begin{bmatrix} \lambda_{1} \end{bmatrix} </math>, where <math>\lambda_{1}</math> is the largest eigenvalue of ''A'' in magnitude. The starting vector <math>b_{0}</math> can be written as a linear combination of the columns of <math>V</math>:
:<math>b_{0} = c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n}.</math>
By assumption, <math>b_{0}</math> has a nonzero component in the direction of the dominant eigenvector, so <math>c_{1} \ne 0</math>.
The computationally useful [[recurrence relation]] for <math>b_{k+1}</math> can be rewritten as
:<math>b_{k+1}=\frac{Ab_{k}}{\|Ab_{k}\|}=\frac{A^{k+1}b_{0}}{\|A^{k+1}b_{0}\|},</math>
where the expression <math>\frac{A^{k+1}b_{0}}{\|A^{k+1}b_{0}\|}</math> is more amenable to the following analysis.
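The equivalence of the two expressions can be checked numerically. The sketch below is an illustrative example (the 2×2 matrix and the function names are chosen for the demonstration): it computes <math>b_k</math> once by normalizing after every multiplication and once by forming <math>A^{k}b_{0}</math> and normalizing only at the end; the two unit vectors agree to machine precision.

```cpp
#include <cmath>

static const double M[2][2] = {{2.0, 1.0}, {1.0, 3.0}}; // example matrix
static const int STEPS = 20;

static void apply(const double x[2], double y[2]) {
    for (int i = 0; i < 2; ++i)
        y[i] = M[i][0] * x[0] + M[i][1] * x[1];
}

static void normalize(double x[2]) {
    double n = std::sqrt(x[0] * x[0] + x[1] * x[1]);
    x[0] /= n; x[1] /= n;
}

// Maximum componentwise difference between the two ways of computing b_k:
// normalizing after every multiplication vs. normalizing A^k b_0 once.
double max_difference() {
    double a[2] = {1.0, 0.0}; // normalized at each step
    double b[2] = {1.0, 0.0}; // normalized only at the end
    double t[2];
    for (int k = 0; k < STEPS; ++k) {
        apply(a, t);
        a[0] = t[0]; a[1] = t[1];
        normalize(a);
        apply(b, t);
        b[0] = t[0]; b[1] = t[1];
    }
    normalize(b);
    double d0 = std::fabs(a[0] - b[0]);
    double d1 = std::fabs(a[1] - b[1]);
    return d0 > d1 ? d0 : d1;
}
```

Normalizing at every step merely rescales the vector, so it cannot change the direction; it only keeps the entries from overflowing in finite-precision arithmetic.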
<math>\displaystyle
\begin{array}{lcl}
b_{k} &=& \frac{A^{k}b_{0}}{\| A^{k} b_{0} \|} \\
&=& \frac{\left( VJV^{-1} \right)^{k} b_{0}}{\|\left( VJV^{-1} \right)^{k}b_{0}\|} \\
&=& \frac{ VJ^{k}V^{-1} b_{0}}{\| V J^{k} V^{-1} b_{0}\|} \\
&=& \frac{ VJ^{k}V^{-1} \left( c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n} \right)}
{\| V J^{k} V^{-1} \left( c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n} \right)\|} \\
&=& \frac{ VJ^{k}\left( c_{1}e_{1} + c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}
{\| V J^{k} \left( c_{1}e_{1} + c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \|} \\
&=& \left( \frac{\lambda_{1}}{|\lambda_{1}|} \right)^{k} \frac{c_{1}}{|c_{1}|}
\frac{ v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k}
\left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}
{\| v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k}
\left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \| }
\end{array}
</math>
The expression above simplifies as <math>k \rightarrow \infty </math>:
:<math>
\left( \frac{1}{\lambda_{1}} J \right)^{k} =
\begin{bmatrix}
1 & & & & \\
& \left( \frac{1}{\lambda_{1}} J_{2} \right)^{k}& & & \\
& & \ddots & \\
& & & \left( \frac{1}{\lambda_{1}} J_{m} \right)^{k} \\
\end{bmatrix}
\rightarrow
\begin{bmatrix}
1 & & & & \\
& 0 & & & \\
& & \ddots & \\
& & & 0 \\
\end{bmatrix}
\quad \text{as} \quad k \rightarrow \infty.
</math>
The limit follows from the fact that the eigenvalue of each block <math> \frac{1}{\lambda_{1}} J_{i} </math> (for <math>i \ge 2</math>) is less than 1 in magnitude, so
:<math>
\left( \frac{1}{\lambda_{1}} J_{i} \right)^{k} \rightarrow 0
\quad \text{as} \quad k \rightarrow \infty.
</math>
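This decay can be observed numerically. The sketch below is an illustrative example (the block and the function name are chosen for the demonstration): it raises a 2×2 Jordan block with eigenvalue 1/2 to the ''k''-th power by repeated multiplication, and every entry tends to 0 even though the block is not diagonal.

```cpp
#include <cmath>

// 2x2 Jordan block scaled by the dominant eigenvalue, with
// |lambda_i / lambda_1| = 1/2 (an illustrative choice).
static const double J2[2][2] = {{0.5, 1.0}, {0.0, 0.5}};

// Largest absolute entry of J2^k, computed by repeated multiplication.
double max_entry(int k) {
    double P[2][2] = {{1.0, 0.0}, {0.0, 1.0}}; // identity
    for (int s = 0; s < k; ++s) {
        double Q[2][2];
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j)
                Q[i][j] = P[i][0] * J2[0][j] + P[i][1] * J2[1][j];
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j)
                P[i][j] = Q[i][j];
    }
    double m = 0.0;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            if (std::fabs(P[i][j]) > m) m = std::fabs(P[i][j]);
    return m;
}
```

Here the power has the closed form <math>J_2^k = \begin{bmatrix} (1/2)^k & k(1/2)^{k-1} \\ 0 & (1/2)^k \end{bmatrix}</math>, so `max_entry(k)` decays like <math>k\,2^{1-k}</math>: slower than for a diagonal block, but still geometrically.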
It follows that
:<math>
\frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k}
\left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right)
\rightarrow 0
\quad \text{as} \quad k \rightarrow \infty.
</math>
Using this fact, <math>b_{k}</math> can be written in a form that emphasizes its relationship with <math>v_{1}</math> when ''k'' is large:
:<math>
\begin{array}{lcl}
b_{k} &=& \left( \frac{\lambda_{1}}{|\lambda_{1}|} \right)^{k} \frac{c_{1}}{|c_{1}|}
\frac{ v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k}
\left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}
{\| v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k}
\left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \| } \\
&=& e^{i \phi_{k}} \frac{c_{1}}{|c_{1}|} v_{1} + r_{k}
\end{array}
</math>
where <math> e^{i \phi_{k}} = \left( \lambda_{1} / |\lambda_{1}| \right)^{k} </math> and <math> \| r_{k} \| \rightarrow 0 </math> as <math>k \rightarrow \infty </math>.

The sequence <math> \left( b_{k} \right)</math> is bounded, so it contains a convergent subsequence. Note that the eigenvector corresponding to the dominant eigenvalue is only unique up to a scalar, so although the sequence <math>\left(b_{k}\right)</math> may not converge, <math>b_{k}</math> is nearly an eigenvector of ''A'' for large ''k''.
Alternatively, if ''A'' is [[diagonalizable]], then the following proof yields the same result.

Let λ<sub>1</sub>, λ<sub>2</sub>, …, λ<sub>''m''</sub> be the ''m'' eigenvalues (counted with multiplicity) of ''A'' and let ''v''<sub>1</sub>, ''v''<sub>2</sub>, …, ''v''<sub>''m''</sub> be the corresponding eigenvectors. Suppose that <math>\lambda_1</math> is the dominant eigenvalue, so that <math>|\lambda_1| > |\lambda_j|</math> for <math>j>1</math>.
The initial vector <math>b_0</math> can be written
:<math>b_0 = c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{m}v_{m}.</math>
If <math>b_0</math> is chosen randomly (with uniform probability), then ''c''<sub>1</sub> ≠ 0 with [[Almost surely|probability 1]]. Now,
:<math>\begin{array}{lcl}A^{k}b_0 & = & c_{1}A^{k}v_{1} + c_{2}A^{k}v_{2} + \cdots + c_{m}A^{k}v_{m} \\
& = & c_{1}\lambda_{1}^{k}v_{1} + c_{2}\lambda_{2}^{k}v_{2} + \cdots + c_{m}\lambda_{m}^{k}v_{m} \\
& = & c_{1}\lambda_{1}^{k} \left( v_{1} + \frac{c_{2}}{c_{1}}\left(\frac{\lambda_{2}}{\lambda_{1}}\right)^{k}v_{2} + \cdots + \frac{c_{m}}{c_{1}}\left(\frac{\lambda_{m}}{\lambda_{1}}\right)^{k}v_{m}\right). \end{array}</math>
The expression within parentheses converges to <math>v_1</math> because <math>|\lambda_j/\lambda_1| < 1</math> for <math>j>1</math>. On the other hand, we have
:<math> b_k = \frac{A^kb_0}{\|A^kb_0\|}. </math>
Therefore, <math>b_k</math> converges to (a multiple of) the eigenvector <math>v_1</math>. The convergence is [[geometric sequence|geometric]], with ratio
:<math> \left| \frac{\lambda_2}{\lambda_1} \right|, </math>
where <math>\lambda_2</math> denotes the second dominant eigenvalue. Thus, the method converges slowly if there is an eigenvalue close in magnitude to the dominant eigenvalue.
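The geometric rate can be observed numerically. The sketch below is an illustrative example (the diagonal matrix and the function name are chosen for the demonstration): with ''A'' = diag(3, 1), the ratio is <math>|\lambda_2/\lambda_1| = 1/3</math>, and the error <math>\|b_k - v_1\|</math> against the known dominant eigenvector <math>v_1 = (1,0)</math> shrinks by roughly that factor at each step.

```cpp
#include <cmath>

// Diagonal example: eigenvalues 3 and 1, so |lambda_2 / lambda_1| = 1/3.
static const double D[2] = {3.0, 1.0};

// Error ||b_k - v_1|| after k power-iteration steps from b_0 = (1, 1),
// where v_1 = (1, 0) is the known dominant eigenvector.
double error_at(int k) {
    double b[2] = {1.0, 1.0};
    for (int s = 0; s < k; ++s) {
        double t0 = D[0] * b[0], t1 = D[1] * b[1];
        double n = std::sqrt(t0 * t0 + t1 * t1);
        b[0] = t0 / n;
        b[1] = t1 / n;
    }
    return std::sqrt((b[0] - 1.0) * (b[0] - 1.0) + b[1] * b[1]);
}
```

For instance, `error_at(11) / error_at(10)` is close to 1/3, matching the predicted ratio.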
==Applications==
Although the power iteration method approximates only one eigenvalue of a matrix, it remains useful for certain [[computational problem]]s. For instance, [[Google]] uses it to calculate the [[PageRank]] of documents in their search engine.<ref>{{cite conference |author1=Ipsen, Ilse |author2=Wills, Rebecca M. |url=http://www4.ncsu.edu/~ipsen/ps/slides_imacs.pdf |title=Analysis and Computation of Google's PageRank |conference=7th IMACS International Symposium on Iterative Methods in Scientific Computing |location=Fields Institute, Toronto, Canada |date=5–8 May 2005}}</ref> For matrices that are well-conditioned and as sparse as the Web matrix, the power iteration method can be more efficient than other methods of finding the dominant eigenvector.
Some of the more advanced eigenvalue algorithms can be understood as variations of the power iteration. For instance, the [[inverse iteration]] method applies power iteration to the matrix <math>A^{-1}</math>. Other algorithms look at the whole subspace generated by the vectors <math>b_k</math>. This subspace is known as the [[Krylov subspace]]. It can be computed by [[Arnoldi iteration]] or [[Lanczos iteration]].
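As a small illustration of inverse iteration, the sketch below runs power iteration on <math>A^{-1}</math>, which converges to the eigenvector of the eigenvalue of ''A'' of smallest magnitude. It is an example, not a practical implementation: the 2×2 matrix and helper names are chosen for the demonstration, and the inverse is formed explicitly, whereas real implementations solve a linear system with ''A'' at each step.

```cpp
#include <cmath>

// Example symmetric matrix with eigenvalues (5 +- sqrt(5))/2,
// i.e. about 3.618 and 1.382.
static const double B[2][2] = {{2.0, 1.0}, {1.0, 3.0}};

// Smallest-magnitude eigenvalue of B, estimated by running power
// iteration on B^{-1} and then taking the Rayleigh quotient with B.
double smallest_eigenvalue(int steps) {
    // Explicit 2x2 inverse via the adjugate formula.
    double det = B[0][0] * B[1][1] - B[0][1] * B[1][0];
    double Binv[2][2] = {{ B[1][1] / det, -B[0][1] / det},
                         {-B[1][0] / det,  B[0][0] / det}};
    double b[2] = {1.0, 0.0};
    for (int k = 0; k < steps; ++k) {
        double t0 = Binv[0][0] * b[0] + Binv[0][1] * b[1];
        double t1 = Binv[1][0] * b[0] + Binv[1][1] * b[1];
        double n = std::sqrt(t0 * t0 + t1 * t1);
        b[0] = t0 / n;
        b[1] = t1 / n;
    }
    // Rayleigh quotient b^T B b (b has unit length).
    double Bb0 = B[0][0] * b[0] + B[0][1] * b[1];
    double Bb1 = B[1][0] * b[0] + B[1][1] * b[1];
    return b[0] * Bb0 + b[1] * Bb1;
}
```

The dominant eigenvalue of <math>A^{-1}</math> is the reciprocal of the smallest-magnitude eigenvalue of ''A'', which is why the iteration picks out that eigenpair.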
Another variation of the power method that simultaneously gives ''n'' eigenvalues and eigenvectors, as well as convergence accelerated to the rate <math> \left| \lambda_{n+1} / \lambda_1\right| </math>, is described in "Multiple extremal eigenpairs by the power method", ''Journal of Computational Physics'', Volume 227, Issue 19, October 2008, pp. 8508–8522 (see also the Los Alamos National Laboratory report LA-UR-07-4046 linked below).
==See also==
* [[Rayleigh quotient iteration]]
* [[Inverse iteration]]

==References==
{{Reflist}}

==External links==
* [http://www.math.buffalo.edu/~pitman/courses/mth437/na2/node17.html Power method], part of lecture notes on numerical linear algebra by E. Bruce Pitman, State University of New York.
* [http://math.fullerton.edu/mathews/n2003/PowerMethodMod.html Module for the Power Method]
* [http://arxiv.org/pdf/0807.1261.pdf "Multiple extremal eigenpairs by the power method"], Los Alamos National Laboratory report LA-UR-07-4046

{{Numerical linear algebra}}
{{Use dmy dates|date=September 2010}}

{{DEFAULTSORT:Power Iteration}}
[[Category:Numerical linear algebra]]