Text provided under a Creative Commons Attribution license, CC-BY. All code is made available under the FSF-approved MIT license. (c) Kyle T. Mandli
Note: This material largely follows the text “Numerical Linear Algebra” by Trefethen and Bau (SIAM, 1997) and is meant as a guide and supplement to the material presented there.
from __future__ import print_function
%matplotlib inline
import matplotlib.pyplot as plt
import numpy
Projections¶
A projector is a square matrix $P$ that satisfies
$$P^2 = P$$
Why does this definition make sense? Why do we require it to be square?
A projector comes from the idea that we want to project a vector onto a lower dimensional subspace. For example, suppose that $\mathbf{v}$ lies completely within the subspace, i.e. $\mathbf{v} \in \text{range}(P)$. If that is the case then $\mathbf{v}$ should not change, or $P\mathbf{v} = \mathbf{v}$. This motivates the definition above.
As another example, take a vector $\mathbf{v}$ and project it onto the subspace to get $P\mathbf{v}$. If we apply the projection again to $P\mathbf{v}$ we now have
$$P(P\mathbf{v}) = P^2\mathbf{v} = P\mathbf{v}$$
It is also important to keep in mind the following: given again $\mathbf{v}$, if we look at the difference between the projection and the original vector, $P\mathbf{v} - \mathbf{v}$, and apply the projection again we have
$$P(P\mathbf{v} - \mathbf{v}) = P^2\mathbf{v} - P\mathbf{v} = 0$$
which means the difference between $\mathbf{v}$ and the projected vector lies in the null space of $P$, i.e. $P\mathbf{v} - \mathbf{v} \in \text{null}(P)$
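As a quick numerical sketch of these properties (the matrix here is our own example, not one from the text), consider a rank-1 projector built from a unit vector:

```python
import numpy

# Rank-1 orthogonal projector onto span{q} (example of our own choosing)
q = numpy.array([1.0, 1.0]) / numpy.sqrt(2.0)
P = numpy.outer(q, q)

v = numpy.array([3.0, 1.0])

# P is a projector: applying it twice is the same as applying it once
assert numpy.allclose(numpy.dot(P, P), P)

# The difference P v - v lies in the null space of P
assert numpy.allclose(numpy.dot(P, numpy.dot(P, v) - v), 0.0)
```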
Complementary Projectors¶
A projector $P$ also has a complement defined to be $I - P$.
Show that this complement is also a projector.
We can show that this is a projector by examining a repeated application of $I - P$:
$$(I - P)^2 = I - 2P + P^2 = I - 2P + P = I - P$$
We also know that for any $\mathbf{v}$
$$P(I - P)\mathbf{v} = (P - P^2)\mathbf{v} = 0$$
so that $\text{range}(I - P) \subseteq \text{null}(P)$, and for any $\mathbf{v} \in \text{null}(P)$ we have $(I - P)\mathbf{v} = \mathbf{v}$
as well.
This shows that
$$\text{range}(I - P) = \text{null}(P)$$
and
$$\text{null}(I - P) = \text{range}(P)$$
implying that $I - P$ projects onto $\text{null}(P)$
exactly.
Reflect on these subspaces and convince yourself that this all makes sense.
This result provides an important property of a projector and its complement, namely that they divide a space into two subspaces whose intersection contains only the zero vector:
$$\text{range}(P) \cap \text{null}(P) = \{0\}$$
or, equivalently, $\text{range}(P) \cap \text{range}(I - P) = \{0\}$.
These two spaces are called complementary subspaces.
Given this property we can take any projector $P$, which will split $\mathbb{C}^m$ into two subspaces $S_1 = \text{range}(P)$ and $S_2 = \text{null}(P)$. If we have a vector $\mathbf{v} \in \mathbb{C}^m$ we can split the vector into components in $S_1$ and $S_2$ by using the projections $P\mathbf{v}$ and $(I - P)\mathbf{v}$,
which we can also observe add to the original vector as
$$P\mathbf{v} + (I - P)\mathbf{v} = \mathbf{v}$$
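A minimal numerical sketch of this splitting (using an oblique projector of our own choosing, so the two pieces are not orthogonal):

```python
import numpy

# An oblique projector: idempotent but not symmetric (our own example)
P = numpy.array([[1.0, 1.0],
                 [0.0, 0.0]])
assert numpy.allclose(numpy.dot(P, P), P)

I = numpy.identity(2)
x = numpy.array([2.0, 3.0])
x_S = numpy.dot(P, x)        # component in range(P)
x_V = numpy.dot(I - P, x)    # component in null(P) = range(I - P)

# The two components add back to the original vector
assert numpy.allclose(x_S + x_V, x)
```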
Try constructing a projection matrix that projects a vector onto one of the coordinate directions ($\mathbf{e}_i$).
What is the complementary projector?
What is the complementary subspace?
Example: A non-orthogonal non-linear projector¶
Given a vector $\mathbf{n}$ of moles of $N$ chemical components,
where $n_i \geq 0$ (e.g. $n_1$ is the number of moles of component 1) and $n_T = \sum_{i=1}^N n_i$ is the total number of moles.
Then we can define the mole fraction of component $i$ as
$$x_i = \frac{n_i}{n_T}$$
and the “vector” of mole fractions $\mathbf{x} = \mathbf{n}/n_T$.
In the homework you will show that
(the sum of the mole fractions add to 1)
mole fractions do not form a vector space or subspace
There exists a non-orthogonal projector $P$ such that $P\mathbf{x} = \mathbf{x}$,
and $P$ is singular (like all projection matrices), so if you know the mole fractions you don’t know how many moles you have.
Find $P$.
Orthogonal Projectors¶
An orthogonal projector is one that projects onto a subspace $S_1$ that is orthogonal to the complementary subspace $S_2$ (this is also phrased as: $P$ projects onto $S_1$ along $S_2$). Note that we are only talking about the subspaces (and their bases), not the projectors! i.e. for orthogonal subspaces all the vectors in $S_1$ are orthogonal to all the vectors in $S_2$.
A Hermitian matrix is one whose complex conjugate transpose is itself, i.e. $P^* = P$.
With this definition we can then say: a projector $P$ is orthogonal if and only if $P$ is Hermitian.
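A small numerical check of this characterization (both matrices are examples of our own choosing): an orthogonal projector is symmetric (Hermitian in the real case), while an oblique projector is idempotent but not symmetric.

```python
import numpy

# Orthogonal projector onto span{q} with ||q|| = 1
q = numpy.array([1.0, 2.0, 2.0]) / 3.0
P_orth = numpy.outer(q, q)
assert numpy.allclose(numpy.dot(P_orth, P_orth), P_orth)  # idempotent
assert numpy.allclose(P_orth, P_orth.T)                   # Hermitian (symmetric)

# An oblique projector: still idempotent, but not Hermitian
P_obl = numpy.array([[1.0, 1.0],
                     [0.0, 0.0]])
assert numpy.allclose(numpy.dot(P_obl, P_obl), P_obl)
assert not numpy.allclose(P_obl, P_obl.T)
```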
Projection with an Orthonormal Basis¶
We can also directly construct a projector that uses an orthonormal basis of the subspace $S$. If we define a matrix $\hat{Q} \in \mathbb{C}^{m \times n}$ whose columns are orthonormal, we can construct an orthogonal projector as
$$P = \hat{Q}\hat{Q}^*$$
Check that $P^2 = P$ and $P^* = P$ (remember $\hat{Q}^*\hat{Q} = I$).
Note that the resulting matrix is in $\mathbb{C}^{m \times m}$ as we require. This means also that the dimension of the subspace projected onto is $n$.
Example: Orthonormal projection and Least-Squares Problems... A review¶
Consider the overdetermined problem $A\mathbf{x} = \mathbf{b}$ where $A \in \mathbb{R}^{m \times 2}$ and $m > 2$, i.e.
$$A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 \end{bmatrix}$$
and $\mathbf{a}_1$, $\mathbf{a}_2$ are linearly independent vectors that span a two-dimensional subspace $S$ of $\mathbb{R}^m$.
If $\mathbf{b} \notin S$, then there is clearly no solution to $A\mathbf{x} = \mathbf{b}$. However, we can find the point that minimizes the length of the error $\mathbf{e} = \mathbf{b} - A\mathbf{x}$. While we could resort to calculus to find the values of $\mathbf{x}$ that minimize $\|\mathbf{e}\|$, it should be clear from the figure that the shortest error (in the $\ell_2$ norm) is the one that is perpendicular to every vector in $S$.
But the subspace of vectors orthogonal to $S = \text{range}(A)$ is the left null space $\text{null}(A^T)$, and therefore we simply seek solutions of
$$A^T(\mathbf{b} - A\mathbf{x}) = 0$$
or we just need to solve the “Normal Equations”
$$A^TA\mathbf{x} = A^T\mathbf{b}$$
for the least-squares solution.
If we’re actually interested in $\mathbf{b}_S = A\mathbf{x}$, which is the orthogonal projection of $\mathbf{b}$ onto $S$, we get
$$\mathbf{b}_S = A(A^TA)^{-1}A^T\mathbf{b}$$
where
$$P = A(A^TA)^{-1}A^T$$
is an orthogonal projection matrix (verify that $P^2 = P$ and $P^T = P$).
For a general matrix $A$, this form of the projection matrix is rather horrid to find; however, if the columns of $A$ formed an orthonormal basis $Q$ for $S$, i.e. $Q^TQ = I$, then the form of the projection matrix is much simpler, therefore
$$P = QQ^T$$
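To see the simplification numerically (the matrix below is our own example), we can compare the general form $A(A^TA)^{-1}A^T$ with $QQ^T$ built from an orthonormalized basis of the same column space:

```python
import numpy

# Columns of A span a 2D subspace of R^3 (example of our own choosing)
A = numpy.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [0.0, 1.0]])

# General form of the orthogonal projector onto range(A)
P_general = numpy.dot(A, numpy.dot(numpy.linalg.inv(numpy.dot(A.T, A)), A.T))

# Orthonormalize the columns; the projector then reduces to Q Q^T
Q, _ = numpy.linalg.qr(A)
P_simple = numpy.dot(Q, Q.T)

assert numpy.allclose(P_general, P_simple)
```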
This is actually quite general. Given any orthonormal basis for a vector space $S$, if these vectors form the columns of $Q$, then the orthogonal projector onto $S$ is always $P = QQ^*$ and the complement is always $I - QQ^*$.
Example: Construction of an orthonormal projector
Take and derive a projector that projects onto the x-y plane and is an orthogonal projector.
The simplest orthonormal basis for the plane is given by the columns of
and an orthogonal projector onto the plane is simply
Q = numpy.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
P = numpy.dot(Q, Q.T)
I = numpy.identity(3)
x = numpy.array([3.0, 4.0, 5.0])
print("x = {}".format(x))
print("\nQ = \n{}".format(Q))
print("\nP = \n{}".format(P))
x_S = numpy.dot(P, x)
x_V = numpy.dot(I - P, x)
print("x_S = {}".format(x_S))
print("x_V = {}\n".format(x_V))
print("x_S + x_V = {}".format(x_S + x_V))
A numerically more sensible approach¶
The previous problem calculated the projection matrix first, then calculated the projection and its complement by matrix vector multiplication, i.e.
A mathematically equivalent, but numerically more efficient method is to calculate
$$\mathbf{x}_S = Q(Q^T\mathbf{x}), \qquad \mathbf{x}_V = \mathbf{x} - \mathbf{x}_S$$
How much more efficient is the latter than the former?
Check¶
Q = numpy.array([[1, 0], [0, 1], [0, 0]])
x = numpy.array([3.0, 4.0, 5.0])
print("x = {}".format(x))
x_S = Q.dot(Q.T.dot(x))
x_V = x - x_S
print("x_S = {}".format(x_S))
print("x_V = {}\n".format(x_V))
print("x_S + x_V = {}".format(x_S + x_V))
Example: Construction of a projector that eliminates a direction¶
Goal: Eliminate the component of a vector $\mathbf{x}$ in the direction $\mathbf{q}$.

e.g. We can calculate the projection in two equivalent ways:
find the matrix that projects onto the plane orthogonal to $\mathbf{q}$
find the matrix that projects onto the unit vector normal to the plane (i.e. parallel to $\mathbf{q}$), and then take its complement projection
Form the projector $P = \mathbf{q}\mathbf{q}^*$. The complement $I - P$ will then include everything BUT that direction.
If $\mathbf{q}$ is a unit vector we can then simply use $P = \mathbf{q}\mathbf{q}^*$. If not, we can write the projector in terms of the arbitrary vector $\mathbf{a}$ as
$$P = \frac{\mathbf{a}\mathbf{a}^*}{\mathbf{a}^*\mathbf{a}}$$
Note the differences in the resulting dimensions between the two values in the fraction (an $m \times m$ outer product divided by a scalar inner product). Also note that, as we saw with the outer product, the resulting $P$ has rank 1.
Now again try to construct a projector in $\mathbb{R}^3$ that projects onto the $x$-$y$ plane.
q = numpy.array([0, 0, 1])
P = numpy.outer(q, q.conjugate())
P_comp = numpy.identity(3) - P
print("P = \n{}\n".format(P))
print("I - P = \n{}".format(P_comp))
x = numpy.array([3, 4, 5])
print(numpy.dot(P, x), q * (q.dot(x)))
print(numpy.dot(P_comp, x), x - q * (q.dot(x)))
a = numpy.array([0, 0, 3])
P = numpy.outer(a, a.conjugate()) / (numpy.dot(a, a.conjugate()))
P_comp = numpy.identity(3) - P
print(numpy.dot(P, x))
print(numpy.dot(P_comp, x))
Quick Review¶
A projection matrix is any square matrix $P$ such that $P^2 = P$
$P$ projects onto a subspace $S = \text{range}(P)$
The complementary projection matrix $I - P$ projects onto $\text{null}(P)$
An orthogonal projector can always be constructed as $P = QQ^*$ where the columns of $Q$ form an orthonormal basis for $S$, and the complement is $I - QQ^*$
Solutions of least-squares problems are essentially projection problems where we seek to solve $A\mathbf{x} = \mathbf{b}_S$ where $\mathbf{b}_S = P\mathbf{b}$
Solution of Least Squares problems by the Normal Equations¶
Given $A\mathbf{x} = \mathbf{b}$ where $A \in \mathbb{R}^{m \times n}$ with $m > n$, we can always solve the least-squares problem using the Normal Equations
$$A^TA\mathbf{x} = A^T\mathbf{b}$$
which actually solves $A\mathbf{x} = \mathbf{b}_S$, where $\mathbf{b}_S$ is the orthogonal projection of $\mathbf{b}$ onto $\text{range}(A)$.
However... as you will show in the homework, this can be numerically inaccurate because the condition number $\kappa(A^TA) = \kappa(A)^2$,
but there is a better way.
Solution of Least Squares problems by the QR factorization¶
Given any matrix $A \in \mathbb{C}^{m \times n}$ that is full column rank, we will show that we can always factor it as $A = QR$,
where $Q$ is a unitary matrix whose columns form an orthonormal basis for $\text{range}(A)$, and $R$ is an upper triangular matrix that says how to reconstruct the columns of $A$ from the columns of $Q$,
i.e. we can construct the columns of $A$ from linear combinations of the columns of $Q$. (And those specific combinations come from $R$.)
Substituting $A = QR$ into the Normal Equations, $R^*Q^*QR\mathbf{x} = R^*Q^*\mathbf{b}$, or since $Q^*Q = I$ and $R^*$ is invertible,
$$R\mathbf{x} = Q^*\mathbf{b}$$
which can be solved quickly by back-substitution as $R$ is a triangular matrix.
Moreover, this problem is much better conditioned, as it can be shown that $\kappa(R) = \kappa(A)$, not $\kappa(A)^2$.
Multiplying both sides by $Q$ again shows that this problem is equivalent to $QR\mathbf{x} = QQ^*\mathbf{b}$,
or $A\mathbf{x} = \mathbf{b}_S$, i.e. we are just solving the projection problem where $\mathbf{b}_S = QQ^*\mathbf{b}$ is the orthogonal projection of $\mathbf{b}$ onto $\text{range}(A)$.
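A short sketch of this solution procedure on a small made-up least-squares problem (the data here are our own; `numpy.linalg.solve` stands in for a dedicated back-substitution routine):

```python
import numpy

# Overdetermined fitting problem A x = b (data of our own choosing)
A = numpy.array([[1.0, 1.0],
                 [1.0, 2.0],
                 [1.0, 3.0],
                 [1.0, 4.0]])
b = numpy.array([1.1, 1.9, 3.2, 3.9])

# Reduced QR, then solve the triangular system R x = Q^T b
Q, R = numpy.linalg.qr(A)
x_qr = numpy.linalg.solve(R, numpy.dot(Q.T, b))

# Agrees with numpy's least-squares solver
x_ls = numpy.linalg.lstsq(A, b, rcond=None)[0]
assert numpy.allclose(x_qr, x_ls)
```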
Question... how do we find $Q$?
QR Factorization¶
One of the most important ideas in linear algebra is the concept of factorizing an original matrix into different constituents that may have useful properties. These properties can help us understand the matrix better and lead to numerical methods. In numerical linear algebra one of the most important factorizations is the QR factorization.
There are actually multiple algorithms that accomplish the same factorization but with very different methods and numerical stability. Here we will discuss three of the algorithms:
Classical Gram-Schmidt Orthogonalization: Transform $A$ to $Q$ by a succession of projections and calculate $R$ as a by-product of the algorithm. (Unfortunately, prone to numerical floating point error)
Modified Gram-Schmidt Orthogonalization: Transform $A$ to $Q$ by a different set of projections. Yields the same $R$ but is more numerically stable.
Householder Triangularization: Transform $A$ to $R$ by a series of unitary transformations; can solve least squares problems directly without accumulating $Q$, or can build $Q$ on the fly.
Classical Gram-Schmidt¶
We begin with a matrix $A \in \mathbb{C}^{m \times n}$ with linearly independent columns,
and want to find a sequence of orthonormal vectors $\mathbf{q}_1, \mathbf{q}_2, \ldots$ that span the sequence of subspaces
$$\langle \mathbf{a}_1 \rangle \subseteq \langle \mathbf{a}_1, \mathbf{a}_2 \rangle \subseteq \langle \mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3 \rangle \subseteq \cdots$$
where $\langle \cdot \rangle$ here indicates the subspace spanned by the vectors.
The individual $\mathbf{q}_j$ will form the columns of the matrix $Q$ such that $\langle \mathbf{q}_1, \ldots, \mathbf{q}_j \rangle = \langle \mathbf{a}_1, \ldots, \mathbf{a}_j \rangle$, where $A_j$ (or $Q_j$) is a matrix with the first $j$ columns of $A$ (or $Q$).
Starting with the first vector $\mathbf{a}_1$, this forms the basis for a 1-dimensional subspace of $\mathbb{C}^m$ (i.e. a line). Thus $\mathbf{q}_1$ is a unit vector along that line, or
$$\mathbf{q}_1 = \frac{\mathbf{a}_1}{\|\mathbf{a}_1\|}$$
For $j = 2$ (which is a plane in $\mathbb{C}^m$) we already have $\mathbf{q}_1$, so we need to find a vector that is orthogonal to $\mathbf{q}_1$ but still in the plane.
An obvious option is to find the component of $\mathbf{a}_2$ that is orthogonal to $\mathbf{q}_1$, but this is just
$$\mathbf{v}_2 = \mathbf{a}_2 - (\mathbf{q}_1^*\mathbf{a}_2)\mathbf{q}_1$$
By construction it’s easy to show that $\mathbf{v}_2$ is orthogonal to $\mathbf{q}_1$ but is not necessarily a unit vector, so we find $\mathbf{q}_2$ by normalizing:
$$\mathbf{q}_2 = \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|}$$
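These first two steps can be carried out numerically with just a few lines (the two columns below are our own example):

```python
import numpy

# Two linearly independent columns (example of our own choosing)
a1 = numpy.array([3.0, 4.0, 0.0])
a2 = numpy.array([1.0, 1.0, 1.0])

q1 = a1 / numpy.linalg.norm(a1)            # normalize the first column
v2 = a2 - numpy.dot(q1, a2) * q1           # component of a2 orthogonal to q1
q2 = v2 / numpy.linalg.norm(v2)            # normalize

assert numpy.isclose(numpy.dot(q1, q2), 0.0)       # orthogonal
assert numpy.isclose(numpy.linalg.norm(q2), 1.0)   # unit length
```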
Classical Gram-Schmidt as a series of projections¶
If we define the unitary matrix
$$Q_j = \begin{bmatrix} \mathbf{q}_1 & \mathbf{q}_2 & \cdots & \mathbf{q}_j \end{bmatrix}$$
as the first $j$ columns of $Q$, then we could also rewrite the first two steps of Classical Gram-Schmidt as:
set $\mathbf{q}_1 = \mathbf{a}_1 / \|\mathbf{a}_1\|$, then
$$\mathbf{q}_2 = \frac{(I - Q_1Q_1^*)\mathbf{a}_2}{\|(I - Q_1Q_1^*)\mathbf{a}_2\|}$$
and we can continue
$$\mathbf{q}_j = \frac{(I - Q_{j-1}Q_{j-1}^*)\mathbf{a}_j}{\|(I - Q_{j-1}Q_{j-1}^*)\mathbf{a}_j\|}$$
With each step finding the component of $\mathbf{a}_j$ that is orthogonal to all the other vectors before it, and normalizing it.
A picture is probably useful here...
But what about $R$?¶
The above algorithm appears to transform $A$ directly to $Q$ without calculating $R$ (which we need for least-squares problems).
But actually that’s not true, as $R$ is hiding in the algorithm (much like $L$ hides in the $LU$ factorization).
If we consider the full factorization $A = QR$,
it follows that $\hat{R} = \hat{Q}^*A$, or
$$\begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & & \\ & & \ddots & \vdots \\ & & & r_{nn} \end{bmatrix} = \hat{Q}^* \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$$
which, with a little bit of work, shows that the nonzero part of the $j$th column of $R$ is given by
$$\mathbf{r}_j = Q_j^*\mathbf{a}_j$$
So the full Classical Gram-Schmidt algorithm looks something like:
Set $r_{11} = \|\mathbf{a}_1\|$, $\mathbf{q}_1 = \mathbf{a}_1 / r_{11}$.
Loop over columns for $j = 2, \ldots, n$:
Find $r_{ij} = \mathbf{q}_i^*\mathbf{a}_j$ for each $i < j$, $\mathbf{v}_j = \mathbf{a}_j - \sum_{i=1}^{j-1} r_{ij}\mathbf{q}_i$, $r_{jj} = \|\mathbf{v}_j\|$, and $\mathbf{q}_j = \mathbf{v}_j / r_{jj}$.
Which builds up both $Q$ and $R$.
Or unrolled:
$$\mathbf{q}_j = \frac{\mathbf{a}_j - \sum_{i=1}^{j-1} r_{ij}\mathbf{q}_i}{r_{jj}}$$
leading us to define
$$r_{ij} = \mathbf{q}_i^*\mathbf{a}_j \ (i \neq j), \qquad r_{jj} = \left\|\mathbf{a}_j - \sum_{i=1}^{j-1} r_{ij}\mathbf{q}_i\right\|$$
This is called the classical Gram-Schmidt iteration. It turns out that the procedure above is unstable because of the rounding errors introduced.
Which can be easily coded up in python as
# Implement Classical Gram-Schmidt Iteration
def classic_GS(A):
m, n = A.shape
Q = numpy.empty((m, n))
R = numpy.zeros((n, n))
# loop over columns
for j in range(n):
v = A[:, j]
for i in range(j):
R[i, j] = numpy.dot(Q[:, i].conjugate(), A[:, j])
v = v - R[i, j] * Q[:, i]
R[j, j] = numpy.linalg.norm(v, ord=2)
Q[:, j] = v / R[j, j]
return Q, R
And check¶
A = numpy.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]], dtype=float)
print(A)
Q, R = classic_GS(A)
print("Q=\n{}\n".format(Q))
print("Q^TQ=\n{}".format(numpy.dot(Q.transpose(), Q)))
print("R=\n{}\n".format(R))
print("QR - A=\n{}".format(numpy.dot(Q, R) - A))
Full vs. Reduced QR¶
If the original matrix $A \in \mathbb{C}^{m \times n}$ where $m > n$, then we can still define a QR factorization, called the full QR factorization, which appends $m - n$ additional orthonormal columns to $\hat{Q}$ (and corresponding rows full of zeros to $\hat{R}$) to reproduce the full, square matrix $Q$.
The factorization $A = \hat{Q}\hat{R}$ is called the reduced or thin QR factorization of $A$.
We require that the additional columns added are an orthonormal basis that is itself orthogonal to $\text{range}(\hat{Q})$. If $A$ is full rank then $\hat{Q}$ and the appended columns provide a basis for $\text{range}(A)$ and its orthogonal complement $\text{null}(A^*)$ respectively.
QR Existence and Uniqueness¶
Two important theorems exist regarding this algorithm which we state without proof:
Every $A \in \mathbb{C}^{m \times n}$ with $m \geq n$ has a full QR factorization and therefore a reduced QR factorization.
Each $A \in \mathbb{C}^{m \times n}$ with $m \geq n$ of full rank has a unique reduced QR factorization with $r_{jj} > 0$.
Modified Gram-Schmidt¶
Unfortunately the classical Gram-Schmidt algorithm is not stable numerically. Instead we can derive a modified method that is more numerically stable but calculates the same $Q$ and $R$ with just a different order of projections.
Recall that the basic piece of the original algorithm was to take the inner product of $\mathbf{a}_j$ and all the relevant $\mathbf{q}_i$. Using the rewritten version of Gram-Schmidt in terms of projections we then have
$$\mathbf{q}_j = \frac{(I - Q_{j-1}Q_{j-1}^*)\mathbf{a}_j}{\|(I - Q_{j-1}Q_{j-1}^*)\mathbf{a}_j\|}$$
Each of these projections is of different rank, although we know that the resulting $\mathbf{q}_j$ are linearly independent by construction. The modified version of Gram-Schmidt instead uses projections that are all of rank $m - 1$. To construct this projection, remember that we can again construct the complement to a projection and perform the following sequence of projections
$$P_j = P_{\perp \mathbf{q}_{j-1}} \cdots P_{\perp \mathbf{q}_2} P_{\perp \mathbf{q}_1}$$
where
$$P_{\perp \mathbf{q}_i} = I - \mathbf{q}_i\mathbf{q}_i^*$$
which projects onto the complementary space orthogonal to $\mathbf{q}_i$.
Note that this performs mathematically the same job as $I - Q_{j-1}Q_{j-1}^*$, however each of these projectors is of rank $m - 1$. The reason why this approach is more stable is that we are not projecting with a possibly arbitrarily low-rank projector, instead we only take projectors that are high-rank.
again...a picture is probably worth a lot here.
This leads to the following set of calculations:
Example: Implementation of modified Gram-Schmidt. Implement the modified Gram-Schmidt algorithm, checking to make sure the resulting factorization has the required properties.
# Implement Modified Gram-Schmidt Iteration
def mod_GS(A):
m, n = A.shape
Q = numpy.empty((m, n))
R = numpy.zeros((n, n))
v = A.copy()
for i in range(n):
R[i, i] = numpy.linalg.norm(v[:, i], ord=2)
Q[:, i] = v[:, i] / R[i, i]
for j in range(i + 1, n):
R[i, j] = numpy.dot(Q[:, i].conjugate(), v[:, j])
v[:, j] -= R[i, j] * Q[:, i]
return Q, R
And check¶
A = numpy.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]], dtype=float)
print(A)
Q, R = mod_GS(A)
print("Q=\n{}\n".format(Q))
print("Q^TQ=\n{}".format(numpy.dot(Q.transpose(), Q)))
print("R=\n{}\n".format(R))
print("QR - A=\n{}".format(numpy.dot(Q, R) - A))
Householder Triangularization¶
One way to also interpret Gram-Schmidt orthogonalization is as a series of right multiplications of the matrix $A$ by upper triangular matrices. For instance, the first step in performing the modified algorithm is to divide the first column through by its norm to give $\mathbf{q}_1$:
$$A R_1 = \begin{bmatrix} \frac{\mathbf{a}_1}{r_{11}} & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$$
We can also perform all the step (2) evaluations by also combining the step that projects onto the complement of $\mathbf{q}_1$, by adding the appropriate values to the entire first row.
The next step can then be placed into the second row.
If we identify the matrices as $R_1$ for the first case, $R_2$ for the second case, and so on, we can write the algorithm as
$$A R_1 R_2 \cdots R_n = \hat{Q}$$
This view of Gram-Schmidt is called Gram-Schmidt triangularization.
Householder triangularization is similar in spirit. Instead of multiplying on the right, Householder multiplies on the left by unitary matrices $Q_k$. Remember that a unitary matrix (or an orthogonal matrix when strictly real) has as its inverse its adjoint (transpose when real) so that $Q^{-1} = Q^*$. We therefore have
$$Q_n \cdots Q_2 Q_1 A = R$$
which, if we identify $Q^* = Q_n \cdots Q_2 Q_1$, and note that if $Q^*$ is unitary then $Q$ is also unitary, gives $A = QR$.
Householder Triangularization¶
While Gram-Schmidt and its variants transform $A$ to $Q$ and calculate $R$ on the way, Householder triangularization is a rather different algorithm that transforms $A$ to $R$ by multiplying by a series of square unitary matrices $Q_k$ that systematically put zeros in the subdiagonal (similar to the LU factorization):
$$Q_n \cdots Q_2 Q_1 A = R$$
If we identify $Q^* = Q_n \cdots Q_2 Q_1$ then we can define $Q = Q_1^* Q_2^* \cdots Q_n^*$, which is also unitary, and
we can then write this as $A = QR$.
This way we can think of Householder triangularization as a process of introducing zeros into $A$ via orthogonal matrices.
Now the question is how do we construct the $Q_k$. The construction is usually broken down into a block matrix of the form
$$Q_k = \begin{bmatrix} I & 0 \\ 0 & H \end{bmatrix}$$
where $I$ is the $(k-1) \times (k-1)$ identity matrix that preserves the top $k-1$ rows,
and $H$ is a unitary matrix that just modifies the lower $m - k + 1$ rows.
Note that this will leave the rows and columns we have already worked on alone and be unitary.
We now turn to the task of constructing the matrix $H$. Note that the definition of $Q_k$ implies that
$$Q_k\mathbf{x} = \begin{bmatrix} \mathbf{x}_{top} \\ H\mathbf{x}_{bottom} \end{bmatrix}$$
where the vectors $\mathbf{x}_{top}$ and $\mathbf{x}_{bottom}$ are defined to be the first $k-1$ and last $m-k+1$ entries of $\mathbf{x}$ respectively. We require that $H\mathbf{x}_{bottom}$ results in a vector that has some nonzero number in the top position and the rest of the numbers are all zeros.
One unitary operation that does this is the Householder Reflection Matrix that reflects the vector $\mathbf{x}$ over the hyperplane normal to the vector $\mathbf{v} = \|\mathbf{x}\|\mathbf{e}_1 - \mathbf{x}$ so that $H\mathbf{x} = \|\mathbf{x}\|\mathbf{e}_1$:

This is of course the effect on only one vector. Any other vector will be reflected across the plane (technically a hyperplane) which is orthogonal to $\mathbf{v}$.
This has a similar construction to the projector complements we were working with before. Consider the projector defined as
$$P = I - \frac{\mathbf{v}\mathbf{v}^*}{\mathbf{v}^*\mathbf{v}}$$
where $\mathbf{v} = \|\mathbf{x}\|\mathbf{e}_1 - \mathbf{x}$.
The vector $P\mathbf{x}$ is now orthogonal to $\mathbf{v}$, i.e. it lies in the hyperplane.
Since we actually want to transform $\mathbf{x}$ to lie in the direction of $\mathbf{e}_1$, we need to go twice as far as $P\mathbf{x}$, which allows us to identify the matrix $H$ as
$$H = I - 2\frac{\mathbf{v}\mathbf{v}^*}{\mathbf{v}^*\mathbf{v}}$$

There is actually a non-uniqueness in which direction we reflect, since the alternate reflector mapping $\mathbf{x}$ to $-\|\mathbf{x}\|\mathbf{e}_1$ is also available. For numerical stability purposes we will choose the reflection target that is the most different from $\mathbf{x}$ itself; this comes back to having difficulties numerically (cancellation) when the vector $\mathbf{x}$ is nearly aligned with the target $\pm\|\mathbf{x}\|\mathbf{e}_1$. By convention the chosen $\mathbf{v}$ is defined by
$$\mathbf{v} = \text{sign}(x_1)\|\mathbf{x}\|\mathbf{e}_1 + \mathbf{x}$$
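A sketch of a single Householder reflector built with this sign convention (the vector here is our own example):

```python
import numpy

# Build a Householder reflector that zeros all but the first entry of x
x = numpy.array([3.0, 1.0, 2.0])
e1 = numpy.zeros(3)
e1[0] = 1.0

# Sign convention: v = sign(x_1) ||x|| e_1 + x avoids cancellation
v = numpy.sign(x[0]) * numpy.linalg.norm(x) * e1 + x
v = v / numpy.linalg.norm(v)
H = numpy.identity(3) - 2.0 * numpy.outer(v, v)

Hx = numpy.dot(H, x)
assert numpy.allclose(Hx[1:], 0.0)                      # zeros below the top entry
assert numpy.isclose(abs(Hx[0]), numpy.linalg.norm(x))  # reflection preserves the norm
```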
# Implementation of Householder QR Factorization
def householder_QR(A, verbose=False):
R = A.copy()
m, n = A.shape
QT = numpy.eye(m)
for k in range(n):
x = numpy.zeros(m)
e = numpy.zeros(m)
x[k:] = R[k:, k]
e[k] = 1.0
# simplest version v = ||x||e - x
# v = numpy.linalg.norm(x, ord=2) * e - x
# alternate version
v = numpy.sign(x[k]) * numpy.linalg.norm(x, ord=2) * e + x
v = v / numpy.linalg.norm(v, ord=2)
R -= 2.0 * numpy.outer(v, numpy.dot(v.T, R))
QT -= 2.0 * numpy.outer(v, numpy.dot(v.T, QT))
Q = QT.T[:, :n]
return Q, R
A = numpy.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]], dtype=float)
print("Matrix A = ")
print(A)
%precision 6
Q, R = householder_QR(A, verbose=False)
print("Householder (reduced) Q =\n{}\n".format(Q))
print("Householder (full) R = ")
print(R)
m, n = A.shape
print(
"Check to see if factorization worked...||A -QR|| = {}".format(
numpy.linalg.norm(A - numpy.dot(Q, R[:n, :n]))
)
)
print(A - numpy.dot(Q, R[:n, :n]))
print(
"\nCheck if Q is unitary...||Q^TQ -I|| = {}".format(
numpy.linalg.norm(Q.T.dot(Q) - numpy.eye(n))
)
)
print(Q.T.dot(Q))
Comparison of accuracy of the different algorithms¶
As it turns out, not all algorithms produce the same quality of orthogonalization. Here we provide a few examples that compare the behavior of the 3 different algorithms and numpy.linalg.qr
Example 1: Random Matrix QR¶
Here we construct a large matrix with random singular vectors and widely varying singular values. The values along the diagonal of $R$ give us some idea of the size of the projections as we go: once an algorithm's $R_{ii}$ values stop decaying and plateau, it is no longer effective at constructing orthogonal directions.
N = 80
# construct a random matrix with known singular values
U, X = numpy.linalg.qr(numpy.random.random((N, N)))
V, X = numpy.linalg.qr(numpy.random.random((N, N)))
S = numpy.diag(2.0 ** numpy.arange(-1.0, -(N + 1), -1.0))
A = numpy.dot(U, numpy.dot(S, V))
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)
Q, R = classic_GS(A)
axes.semilogy(numpy.diag(R), "bo", label="Classic")
Q, R = mod_GS(A)
axes.semilogy(numpy.diag(R), "ro", label="Modified")
Q, R = householder_QR(A)
axes.semilogy(numpy.diag(R), "ko", label="Householder")
Q, R = numpy.linalg.qr(A)
axes.semilogy(numpy.diag(R), "go", label="numpy")
axes.set_xlabel("Index", fontsize=16)
axes.set_ylabel("$R_{ii}$", fontsize=16)
axes.legend(loc=3, fontsize=14)
axes.plot(numpy.arange(0, N), numpy.ones(N) * numpy.sqrt(numpy.finfo(float).eps), "k--")
axes.plot(numpy.arange(0, N), numpy.ones(N) * numpy.finfo(float).eps, "k--")
plt.show()
%precision 16
A = numpy.array([[0.7, 0.70711], [0.70001, 0.70711]])
print("\ncond(A) = {}".format(numpy.linalg.cond(A)))
To check that the matrix $Q$ is really unitary, we compute $\|Q^TQ - I\|$ with the different algorithms and compare.
Q_c, R = classic_GS(A)
r_c = numpy.linalg.norm(numpy.dot(Q_c.transpose(), Q_c) - numpy.eye(2))
print("Classic: ", r_c)
Q, R = mod_GS(A)
print("Modified: ", numpy.linalg.norm(numpy.dot(Q.transpose(), Q) - numpy.eye(2)))
Q_h, R = householder_QR(A)
r_h = numpy.linalg.norm(numpy.dot(Q_h.transpose(), Q_h) - numpy.eye(2))
print("Householder:", r_h)
Q, R = numpy.linalg.qr(A)
r = numpy.linalg.norm(numpy.dot(Q.transpose(), Q) - numpy.eye(2))
print("Numpy: ", r)
print("\ncond(A) = {}".format(numpy.linalg.cond(A)))
print("r_classic/r_householder = {}".format(r_c / r_h))
Applications of the QR¶
The $QR$ factorization, and unitary transformations such as Householder reflections, play important roles in a wide range of numerical algorithms.
Least-Squares problems: Solving with QR¶
We have already discussed solving overdetermined least-squares problems using the $QR$ factorization. In general, if we seek least-squares solutions to $A\mathbf{x} = \mathbf{b}$ and $A = QR$, the problem reduces to solving the triangular problem
$$R\mathbf{x} = Q^*\mathbf{b}$$
If we use Householder triangularization to transform directly, we do not need to explicitly form the matrix $Q$ and can save memory and computation. If we consider the augmented system $[\, A \;|\; \mathbf{b} \,]$ and apply the same sequence of unitary transformations we used in the Householder algorithm, the sequence becomes
$$[\, R \;|\; Q^*\mathbf{b} \,]$$
Thus we just need to solve
$$R\mathbf{x} = Q^*\mathbf{b}$$
Alternatively, during the factorization by Householder, we can just store the vectors $\mathbf{v}_k$ and reconstruct $Q^*\mathbf{b}$ by rank-1 updates as necessary.
Finding Eigenvalues¶
As it turns out, the $QR$ factorization and Householder transformations are also extremely useful for finding eigenvalues of a matrix $A$. There is an entire notebook 13_LA_eigen.ipynb that develops these ideas in detail. Here we will just discuss the parts that relate to the $QR$ factorization and orthogonalization algorithms.
The basics¶
The eigenproblem
$$A\mathbf{x} = \lambda\mathbf{x}$$
can be rewritten as
$$(A - \lambda I)\mathbf{x} = 0$$
which implies that the eigenvectors are in the null space of $A - \lambda I$.
However, for this matrix to have a non-trivial null space requires that $A - \lambda I$ is singular.
Characteristic Polynomial¶
If $A - \lambda I$ is singular, it follows that
$$\det(A - \lambda I) = 0$$
where $\det(A - \lambda I)$ can be shown to be an $m$th order polynomial in $\lambda$, known as the characteristic polynomial of the matrix.
Computing Eigenvalues¶
In basic linear algebra classes we usually find the eigenvalues by directly calculating the roots of the characteristic polynomial, which can work for low-degree polynomials. Unfortunately the following theorem (due to Galois) suggests this is not a good way to compute eigenvalues:
Theorem: For every $m \geq 5$ there is a polynomial of degree $m$ with rational coefficients that has a real root that cannot be written using any expression involving rational numbers, addition, subtraction, multiplication, division, and $k$th roots.
I.e., there is no way to find the roots of a general polynomial of degree $\geq 5$ in a deterministic, fixed number of steps.
Not all is lost however!¶
We must instead use an iterative approach where we construct a sequence that converges to the eigenvalues.
Some Questions
How does this relate to how we found roots previously?
Why will it still be difficult to use our rootfinding routines to find Eigenvalues?
We will return to how we actually find Eigenvalues (and roots of polynomials) after a bit more review
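As a taste of what such an iterative approach looks like, here is a power-iteration sketch (not one of the algorithms developed below; the matrix is our own example) that converges to the dominant eigenvalue:

```python
import numpy

# Power iteration: repeatedly apply A and normalize
A = numpy.array([[2.0, 1.0],
                 [1.0, 3.0]])
x = numpy.array([1.0, 0.0])
for _ in range(100):
    x = numpy.dot(A, x)
    x = x / numpy.linalg.norm(x)

# Rayleigh quotient gives the dominant eigenvalue estimate
lam = numpy.dot(x, numpy.dot(A, x))
assert numpy.isclose(lam, numpy.max(numpy.linalg.eigvals(A)))
```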
Similarity Transformations¶
Generally, we say any two matrices $A$ and $B$ are similar if they can be related through an invertible matrix $X$ as
$$B = X^{-1}AX$$
Example
The general eigenproblem
$$A\mathbf{x} = \lambda\mathbf{x}$$
is really $m$ problems, one for each eigenvalue, eigenvector pair, i.e.
$$A\mathbf{x}_i = \lambda_i\mathbf{x}_i, \quad i = 1, \ldots, m$$
which can be written concisely in matrix form as
$$AX = X\Lambda$$
where $X$ is a matrix whose columns contain the eigenvectors and $\Lambda$ is a diagonal matrix of corresponding eigenvalues. This form is always true.
Example Diagonalizable matrices¶
If a matrix $A$ has $m$ linearly independent eigenvectors, then we say $A$ is diagonalizable and we can factor it as
$$A = X\Lambda X^{-1}$$
where $X$ is a matrix of eigenvectors, and $\Lambda$ is a diagonal matrix of corresponding eigenvalues,
which says $A$ is similar to $\Lambda$ with similarity transform $X$.
Theorem:¶
If $A$ and $B$ are similar matrices, they have the same eigenvalues and their eigenvectors are related through the invertible matrix $X$.
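A quick numerical check of this theorem with matrices of our own choosing:

```python
import numpy

# A and B = X^{-1} A X are similar, with X an (arbitrary) invertible matrix
A = numpy.array([[2.0, 1.0],
                 [0.0, 3.0]])
X = numpy.array([[1.0, 2.0],
                 [1.0, 1.0]])
B = numpy.dot(numpy.linalg.inv(X), numpy.dot(A, X))

# Similar matrices share the same eigenvalues
eig_A = numpy.sort(numpy.linalg.eigvals(A))
eig_B = numpy.sort(numpy.linalg.eigvals(B))
assert numpy.allclose(eig_A, eig_B)
```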
Schur Factorization¶
A Schur factorization of a matrix $A$ is defined as
$$A = QTQ^*$$
where $Q$ is unitary and $T$ is upper triangular. Because $Q^{-1} = Q^*$ (for square unitary matrices), it follows directly that $A$ and $T$ are similar.
Good news! $T$ is upper triangular so its eigenvalues can just be read off the diagonal.
Bad news! There is no deterministic way to calculate $T$ directly, as that would violate the Galois theory result above.
Theorem: Every matrix has a Schur factorization.
Partial proof for a diagonalizable matrix: if $A$ is diagonalizable, $A = X\Lambda X^{-1}$. But we know we can always factor $X = QR$ and substitute to show
$$A = QR\Lambda R^{-1}Q^*$$
and it is not hard to show that the product $R\Lambda R^{-1}$ is also an upper triangular matrix (exercise left to the reader).
(For a non-diagonalizable matrix the proof requires showing the existence of the Jordan form )
Note that the above results imply the following
An eigen-decomposition $A = X\Lambda X^{-1}$ exists if and only if $A$ is non-defective (it has a complete set of eigenvectors)
A unitary diagonalization $A = Q\Lambda Q^*$ with real $\Lambda$ exists if and only if $A$ is Hermitian ($A^* = A$)
A Schur factorization always exists
Note that each of these lead to a means for isolating the eigenvalues of a matrix and will be useful when considering algorithms for finding them.
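We can compute a Schur factorization numerically with `scipy.linalg.schur` (scipy is an assumption here; the rest of this notebook only uses numpy):

```python
import numpy
from scipy.linalg import schur  # scipy assumed available

A = numpy.array([[2.0, 1.0, 1.0],
                 [1.0, 3.0, 1.0],
                 [1.0, 1.0, 4.0]])
T, Q = schur(A)

# A = Q T Q^T with Q unitary (orthogonal here, since A is real)
assert numpy.allclose(numpy.dot(Q, numpy.dot(T, Q.T)), A)
assert numpy.allclose(numpy.dot(Q.T, Q), numpy.eye(3))

# The eigenvalues sit on the diagonal of T
assert numpy.allclose(numpy.sort(numpy.diag(T)),
                      numpy.sort(numpy.linalg.eigvals(A)))
```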
Hessenberg form¶
The first step to finding the Schur factorization is to try and get as close to triangular as possible without changing its eigenvalues. This requires a series of similarity transformations.
As it turns out, the closest we can do is to reduce it to Hessenberg form, which is upper triangular with one extra subdiagonal. This can be done with explicit similarity transformations using Householder transformations
$$Q_{m-2}^* \cdots Q_1^* A Q_1 \cdots Q_{m-2} = H$$
so we have the sequence $A \rightarrow H$, which has the same eigenvalues as the original matrix $A$.
Question: why can’t we just use Householder to take $A$ all the way to upper triangular $T$ like we did for the $QR$ factorization?
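We can check the Hessenberg reduction numerically using `scipy.linalg.hessenberg` (scipy is an assumption here; a symmetric test matrix of our own choosing is used so the eigenvalues are real):

```python
import numpy
from scipy.linalg import hessenberg  # scipy assumed available

numpy.random.seed(0)
B = numpy.random.random((5, 5))
A = B + B.T   # symmetric, so the eigenvalues are real
H, Q = hessenberg(A, calc_q=True)

# H is upper Hessenberg: zero below the first subdiagonal
assert numpy.allclose(numpy.tril(H, k=-2), 0.0)

# A = Q H Q^T is a similarity transform, so the eigenvalues are unchanged
assert numpy.allclose(numpy.sort(numpy.linalg.eigvals(A)),
                      numpy.sort(numpy.linalg.eigvals(H)))
```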
QR/RQ Algorithm¶
Given a matrix in Hessenberg form, it turns out we can use repeated factorizations to reduce the size of the subdiagonal and iterate towards the Schur factorization to find all the eigenvalues simultaneously.
The simplest algorithm just iterates
while not converged:
Q, R = numpy.linalg.qr(A)
A = R.dot(Q)
calculating the $QR$ factorization of $A^{(k)}$, then forming a new $A^{(k+1)} = R^{(k)}Q^{(k)}$. This sequence will eventually converge to the Schur decomposition of the matrix $A$.
Code this up and see what happens.
%precision 6
m = 3
A0 = numpy.array([[2, 1, 1], [1, 3, 1], [1, 1, 4]])
MAX_STEPS = 10
A = A0
print("A=")
print(A)
for i in range(MAX_STEPS):
Q, R = numpy.linalg.qr(A)
A = numpy.dot(R, Q)
print()
print("A({}) =".format(i))
print(A)
print()
print("True eigenvalues: ")
print(numpy.sort(numpy.linalg.eigvals(A0)))
print()
print("Computed eigenvalues: ")
print(numpy.sort(numpy.diag(A)))
So why does this work? The first step is to find the factorization
which is equivalent to finding
$$A^{(0)} = Q^{(1)}R^{(1)}$$
and multiplying on the right by $Q^{(1)}$ leads to
$$A^{(1)} = R^{(1)}Q^{(1)} = (Q^{(1)})^T A^{(0)} Q^{(1)}$$
In this way we can see that this is a similarity transformation of the matrix $A$ since $Q^{(1)}$ is an orthogonal matrix ($Q^{-1} = Q^T$). This of course is not a great idea to do directly, but works great in this case as we iterate to find the upper triangular matrix $T$, which is exactly where the eigenvalues appear.
In practice this basic algorithm is modified to include a few additions:
Before starting the iteration, $A$ is reduced to tridiagonal (if Hermitian) or Hessenberg form.
Motivated by the inverse power iteration, we instead consider a shifted matrix $A - \mu I$ for factoring. The $\mu$ picked is related to the eigenvalue estimate given by the Rayleigh quotient. Here we have
$$A^{(k)} - \mu^{(k)} I = Q^{(k)}R^{(k)}, \qquad A^{(k+1)} = R^{(k)}Q^{(k)} + \mu^{(k)} I$$
Deflation is used to reduce the matrix into smaller matrices once we have found (or are close to finding) an eigenvalue, to simplify the problem.
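A sketch of the shifted iteration (using the corner entry of $A^{(k)}$ as a simple Rayleigh-quotient-style shift, with no deflation; practical implementations are more careful):

```python
import numpy

# Shifted QR iteration on a small symmetric matrix (our own example)
A = numpy.array([[2.0, 1.0, 1.0],
                 [1.0, 3.0, 1.0],
                 [1.0, 1.0, 4.0]])
Ak = A.copy()
for k in range(30):
    mu = Ak[-1, -1]                                  # shift from the corner entry
    Q, R = numpy.linalg.qr(Ak - mu * numpy.eye(3))
    Ak = numpy.dot(R, Q) + mu * numpy.eye(3)         # undo the shift

# Diagonal entries converge to the eigenvalues of A
assert numpy.allclose(numpy.sort(numpy.diag(Ak)),
                      numpy.sort(numpy.linalg.eigvals(A)))
```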
This has been the standard approach until recently for finding eigenvalues of a matrix.
Application: Finding the roots of a polynomial¶
Numpy has a nice function called roots which returns the roots of an $n$th degree polynomial
$$p(x) = p_0 x^n + p_1 x^{n-1} + \cdots + p_{n-1} x + p_n$$
described by a vector of coefficients $[p_0, p_1, \ldots, p_n]$.
p = numpy.array([1, 1, -1])
r = numpy.roots(p)
print(r)
p = numpy.random.rand(8)
r = numpy.roots(p)
print(r)
This routine does not try to actually find the roots of a high-order polynomial; instead it calculates the eigenvalues of a companion matrix whose characteristic polynomial is the monic polynomial $p(x)/p_0$.
It can be shown that this matrix can be constructed as (see e.g.)
def myroots(p, verbose=False):
"""Calculate the roots of a polynomial described by coefficient vector
in numpy.roots order
p(x) = p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \ldots + p_n
by finding the eigenvalues of the companion matrix
returns:
--------
eigenvalues sorted by |\lambda|
"""
# construct the companion matrix of the coefficient vector c
# make p monic and reverse the order for this definition of the companion matrix
c = numpy.flip(p / p[0])
if verbose:
print(c)
m = len(c) - 1
C = numpy.zeros((m, m))
C[:, -1] = -c[:-1]
C[1:, :-1] = numpy.eye(m - 1)
if verbose:
print("C = \n{}".format(C))
# calculate the eigenvalues of the companion matrix, then sort by |lambda|
eigs = numpy.linalg.eigvals(C)
index = numpy.flip(numpy.argsort(numpy.abs(eigs)))
return eigs[index]
p = numpy.array([1, 1, -1])
r = numpy.roots(p)
print(r)
mr = myroots(p)
print()
print(mr)
%precision 4
p = numpy.random.rand(5)
r = numpy.roots(p)
print("nproots: {}".format(r))
mr = myroots(p)
print()
print("myroots: {}".format(mr))