Text provided under a Creative Commons Attribution license, CC-BY. All code is made available under the FSF-approved MIT license. (c) Kyle T. Mandli
Note: This material largely follows the text “Numerical Linear Algebra” by Trefethen and Bau (SIAM, 1997) and is meant as a guide and supplement to the material presented there.
from __future__ import print_function
%matplotlib inline
import matplotlib.pyplot as plt
import numpy

Conditioning and Stability¶
Once an approximation to a linear system is constructed, the next question is how much trust we can put in the approximation. Since the true solution is not known, one of the few tools we have is to ask how well the approximation matches the original equation. In other words, we seek the solution $x$ to a system

$$A x = b.$$
We do not have $x$ but instead have an approximation, $\hat{x}$, and we hope that

$$A \hat{x} \approx b.$$
In this section the question we explore is how to determine a bound on the relative error $||x - \hat{x}|| / ||x||$ given the matrix $A$.
This leads to the notion of conditioning. Conditioning describes the behavior of a problem when its input is changed a small bit (perturbed), and it is a mathematical (analytic) property of the original system of equations. Stability, on the other hand, is concerned with how the algorithm used to obtain an approximation behaves when its input is perturbed.
Conditioning and Condition Numbers¶
A well-conditioned problem is one where a small perturbation to the original problem leads to only small changes in the solution.
Formally we can think of a function $f$ which maps the input $x$ to the solution $f(x)$.

Let $\delta x$ be a small perturbation of $x$ and let $\delta f = f(x + \delta x) - f(x)$; we ask how the result changes:

$$||\delta f|| \leq C ||\delta x||$$

for some constant $C$, possibly dependent on $x$, depending on the type of conditioning we are considering.
Absolute Condition Number¶
If we let $\delta x$ be the small perturbation to the input and $\delta f = f(x + \delta x) - f(x)$ be the resulting change in the output, the absolute condition number can be defined as

$$\hat{\kappa} = \lim_{\delta \rightarrow 0^+} \sup_{||\delta x|| \leq \delta} \frac{||\delta f||}{||\delta x||}.$$

For most problems this can be simplified to

$$\hat{\kappa} = \sup_{\delta x} \frac{||\delta f||}{||\delta x||}$$

(assuming $\delta f$ and $\delta x$ are both infinitesimal).
When $f$ is differentiable we can evaluate the condition number via the Jacobian. Recall that the derivative of a multi-variable function $f$ can be expressed in the form of a Jacobian $J(x)$ where

$$[J(x)]_{ij} = \frac{\partial f_i(x)}{\partial x_j}.$$

This allows us to bound the infinitesimal $\delta f$ as

$$||\delta f|| \leq ||J(x)|| \, ||\delta x||$$

with equality as $||\delta x|| \rightarrow 0$. Then we can write the absolute condition number as

$$\hat{\kappa} = ||J(x)||$$
where the matrix norm is the one induced by the norms on the spaces of $\delta x$ and $\delta f$.
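For a scalar function the Jacobian is simply the derivative, so $\hat{\kappa} = |f'(x)|$. A minimal numerical sketch of this (using $f(x) = \sin(x)$, a function chosen here purely for illustration):

```python
import numpy as np

def abs_condition_estimate(f, x, h=1e-7):
    # Finite-difference estimate of ||delta f|| / ||delta x||
    # for a small perturbation h of the input.
    return abs(f(x + h) - f(x)) / h

x0 = 1.2
estimate = abs_condition_estimate(np.sin, x0)
exact = abs(np.cos(x0))   # kappa_hat = |J(x)| = |f'(x)| for scalar f
print(estimate, exact)
```

The finite-difference ratio approaches $|J(x)|$ as the perturbation shrinks, matching the definition above.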
Relative Condition Number¶
The relative condition number is defined similarly and is related to the difference between the absolute error and relative error as defined previously. With the same caveats as before it can be defined as

$$\kappa = \lim_{\delta \rightarrow 0^+} \sup_{||\delta x|| \leq \delta} \frac{||\delta f|| \, / \, ||f(x)||}{||\delta x|| \, / \, ||x||}.$$
Again if $f$ is differentiable we can use the Jacobian $J(x)$ to evaluate the relative condition number as

$$\kappa = \frac{||J(x)|| \, ||x||}{||f(x)||}.$$
Example: Calculate the relative condition number for the scalar function $f(x) = x_1 - x_2$ using the vector $x = (x_1, x_2)^T$ and an $\infty$-norm.
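A sketch of such a computation, assuming the function in question is $f(x) = x_1 - x_2$ (the classic cancellation example; the details change with the choice of $f$): the Jacobian is

$$J = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 1 & -1 \end{bmatrix}, \qquad ||J||_\infty = 2,$$

so the relative condition number is

$$\kappa = \frac{||J||_\infty \, ||x||_\infty}{|f(x)|} = \frac{2 \max(|x_1|, |x_2|)}{|x_1 - x_2|},$$

which is large when $x_1 \approx x_2$: subtracting nearly equal numbers is ill-conditioned (catastrophic cancellation).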
The condition number of a function was discussed in general terms above. Now we examine the more specific case of a linear function, a matrix-vector multiplication. Here we let $f(x) = A x$ and determine the condition number by perturbing $x$.
We begin with the definition above, noting that $J = A$,

$$\kappa = \frac{||A|| \, ||x||}{||A x||}$$

where $x$ is a vector.

If $A$ has an inverse, then from $x = A^{-1} A x$ we note that

$$\frac{||x||}{||A x||} \leq ||A^{-1}||$$

so that

$$\kappa \leq ||A|| \, ||A^{-1}||.$$
Condition Number of a Matrix¶
The condition number of a matrix is defined by the product

$$\kappa(A) = ||A|| \, ||A^{-1}||$$

where here we are thinking about the matrix rather than a problem. If $\kappa(A)$ is small then $A$ is said to be well-conditioned. If $A$ is singular we assign $\kappa(A) = \infty$ as the matrix's condition number.

When we are considering the $\ell_2$ norm we can write the condition number in terms of the singular values of $A$ as

$$\kappa_2(A) = \frac{\sigma_{max}}{\sigma_{min}}.$$
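These equivalent characterizations are easy to check numerically (the matrix here is arbitrary, chosen only to be non-singular):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# kappa(A) = ||A|| ||A^{-1}|| in the 2-norm
kappa_prod = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# Equivalent: ratio of extreme singular values
sigma = np.linalg.svd(A, compute_uv=False)
kappa_svd = sigma.max() / sigma.min()

# NumPy's built-in condition number (2-norm by default)
kappa_np = np.linalg.cond(A)

print(kappa_prod, kappa_svd, kappa_np)
```

All three agree to rounding error since $||A||_2 = \sigma_{max}$ and $||A^{-1}||_2 = 1 / \sigma_{min}$.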
Condition Number of a System of Equations¶
Another way to think about the conditioning of a problem we have looked at before is that the matrix $A$ itself is an input to the problem. Consider then the system of equations $A x = b$ where we will perturb both $A$ and $x$, resulting in

$$(A + \delta A)(x + \delta x) = b.$$
Assuming we solve the problem exactly we know that $A x = b$, and the product of infinitesimals $\delta A \, \delta x$ is smaller than the other terms, so the above expression can be approximated by

$$A \, \delta x + \delta A \, x = 0 \quad \Rightarrow \quad \delta x = -A^{-1} \, \delta A \, x.$$
We can then say the following regarding the condition number of a system of equations:

$$\frac{||\delta x||}{||x||} \leq ||A^{-1}|| \, ||\delta A|| \quad \Rightarrow \quad \frac{||\delta x|| \, / \, ||x||}{||\delta A|| \, / \, ||A||} \leq ||A^{-1}|| \, ||A|| = \kappa(A).$$
Theorem: Let $b$ be fixed and consider the problem of computing $x = A^{-1} b$ where $A$ is square and non-singular. The condition number of this problem with respect to perturbations in $A$ is the condition number of the matrix,

$$\kappa = ||A|| \, ||A^{-1}|| = \kappa(A).$$
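A small experiment illustrating this bound (the matrix and perturbation are random, chosen only for illustration; the perturbation is scaled tiny so the first-order analysis applies):

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)
x = np.linalg.solve(A, b)

# Tiny random perturbation of A
dA = 1e-10 * rng.standard_normal((5, 5))
dx = np.linalg.solve(A + dA, b) - x

# Ratio of relative change in x to relative change in A
ratio = (np.linalg.norm(dx) / np.linalg.norm(x)) / \
        (np.linalg.norm(dA, 2) / np.linalg.norm(A, 2))
kappa = np.linalg.cond(A)

print(ratio, kappa)   # ratio is bounded (to first order) by kappa
```

Running this with different seeds shows the ratio varying below $\kappa(A)$, with the bound attained only for worst-case perturbations.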
Stability¶
We now return to the fact that we are interested not only in the well-conditioning of a mathematical problem but also in how we might solve it on a finite precision machine. In some sense conditioning describes how well we can solve a problem in exact arithmetic, and stability describes how well we can solve the problem in finite precision arithmetic.
Accuracy and Stability¶
As we have defined before, we will consider the absolute error as

$$||\tilde{f}(x) - f(x)||$$

where $\tilde{f}(x)$ is the approximation to the true solution $f(x)$. Similarly we can define the relative error as

$$\frac{||\tilde{f}(x) - f(x)||}{||f(x)||}.$$

In the ideal case we would like the relative error to be $\mathcal{O}(\epsilon_{machine})$.
Forwards Stability¶
A forward stable algorithm $\tilde{f}$ for $f$ has

$$\frac{||\tilde{f}(x) - f(x)||}{||f(x)||} = \mathcal{O}(\epsilon_{machine}).$$

In other words
A forward stable algorithm gives almost the right answer to exactly the right question.
Backwards Stability¶
A stronger notion of stability can also be defined which is satisfied by many approaches in numerical linear algebra. We say that an algorithm $\tilde{f}$ is backward stable if for each $x$ we have

$$\tilde{f}(x) = f(\tilde{x})$$

for some $\tilde{x}$ with

$$\frac{||\tilde{x} - x||}{||x||} = \mathcal{O}(\epsilon_{machine}).$$

In other words
A backward stable algorithm gives exactly the right answer to nearly the right question.
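For $A x = b$ the backward error of a computed $\tilde{x}$ can be measured directly: the smallest relative perturbation of $A$ for which $\tilde{x}$ is an exact solution is $||b - A \tilde{x}|| \, / \, (||A|| \, ||\tilde{x}||)$ (the Rigal-Gaches formula). A quick sketch checking that a standard solver is backward stable in this sense (random data, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
b = rng.standard_normal(50)

x_tilde = np.linalg.solve(A, b)
r = b - A @ x_tilde

# Normwise backward error: x_tilde exactly solves (A + dA) x = b
# for some dA with ||dA|| / ||A|| equal to this quantity.
backward_error = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_tilde))

print(backward_error)   # a small multiple of machine epsilon
```

The backward error is near $\epsilon_{machine}$ even when $\kappa(A)$ is large; it is the forward error that degrades with conditioning.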
Combining these ideas, along with the idea that we should not expect to be able to accurately compute the solution to a poorly conditioned problem, we can form the mixed forward-backward sense of stability: $\tilde{f}$ is stable for $f$ if

$$\frac{||\tilde{f}(x) - f(\tilde{x})||}{||f(\tilde{x})||} = \mathcal{O}(\epsilon_{machine})$$

for some $\tilde{x}$ with

$$\frac{||\tilde{x} - x||}{||x||} = \mathcal{O}(\epsilon_{machine}).$$
In other words
A stable algorithm gives nearly the right answer to nearly the right question.
An important aspect of the above statement is that we cannot necessarily guarantee an accurate result. If the condition number is small, we would expect that a stable algorithm would give us an accurate result (by definition). This is reflected in the following theorem.
Theorem: Suppose a backward stable algorithm is applied to solve a problem $f$ with condition number $\kappa$ on a finite precision machine; then the relative errors satisfy

$$\frac{||\tilde{f}(x) - f(x)||}{||f(x)||} = \mathcal{O}(\kappa(x) \, \epsilon_{machine}).$$
Proof: By the definition of the condition number of a problem we can write

$$\frac{||f(\tilde{x}) - f(x)||}{||f(x)||} \leq \left(\kappa(x) + o(1)\right) \frac{||\tilde{x} - x||}{||x||}.$$

Combining this with the definition of backwards stability, namely $\tilde{f}(x) = f(\tilde{x})$ with $||\tilde{x} - x|| \, / \, ||x|| = \mathcal{O}(\epsilon_{machine})$, we arrive at the statement of the theorem.
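An illustration of the $\mathcal{O}(\kappa \, \epsilon_{machine})$ behavior using the notoriously ill-conditioned Hilbert matrix, constructed here directly with NumPy:

```python
import numpy as np

n = 10
# Hilbert matrix: H[i, j] = 1 / (i + j + 1)
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
H = 1.0 / (i + j + 1)

x_true = np.ones(n)
b = H @ x_true
x_tilde = np.linalg.solve(H, b)

rel_err = np.linalg.norm(x_tilde - x_true) / np.linalg.norm(x_true)
kappa = np.linalg.cond(H)
eps = np.finfo(float).eps

print(rel_err, kappa * eps)
# The error is far larger than eps alone, but within O(kappa * eps).
```

The solver is backward stable, yet the forward error is amplified by $\kappa(H) \approx 10^{13}$, exactly as the theorem predicts.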
Backward Error Analysis - Process of using the condition number of the problem and stability of the algorithm to determine the error.
Forward Error Analysis - Considers the accrual of error at each step of an algorithm given slightly perturbed input.
Stability of Solving $A x = b$ using Householder Triangularization¶
As an example, let's consider the conditioning and a specific algorithm for solving $A x = b$. Here we will use a $QR$ factorization approach given by Householder triangularization. First, let's discuss the factorization itself.
Theorem: Let the $QR$ factorization of a matrix $A$ be computed using a Householder triangularization approach on a finite precision machine; then

$$\tilde{Q} \tilde{R} = A + \delta A, \quad \frac{||\delta A||}{||A||} = \mathcal{O}(\epsilon_{machine})$$

for some $\delta A$, where $\tilde{Q}$ and $\tilde{R}$ are the finite arithmetic versions of $Q$ and $R$. Householder triangularization is therefore backward stable.
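A quick numerical check of this backward-error bound using NumPy's `qr` (which calls LAPACK's Householder-based factorization; the random matrix is for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 100))

Q, R = np.linalg.qr(A)

# Backward error of the computed factorization
factor_err = np.linalg.norm(Q @ R - A, 2) / np.linalg.norm(A, 2)
# Loss of orthogonality in the computed Q
orth_err = np.linalg.norm(Q.T @ Q - np.eye(100), 2)

print(factor_err, orth_err)   # both are small multiples of machine epsilon
```

Note that the computed $\tilde{Q}$ stays numerically orthogonal as well, a property that classical Gram-Schmidt, for instance, does not share.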
Solving $A x = b$ with $QR$ Factorization¶
So Householder triangularization is backwards stable, but we also know that this does not guarantee accuracy if the problem itself is ill-conditioned. Is backward stability enough to guarantee accurate results if we use it to solve $A x = b$, for instance? It turns out that the backward stability of the computed product $\tilde{Q} \tilde{R}$ is enough to guarantee the backward stability of the larger algorithm.
Consider the steps to solving $A x = b$ using $QR$ factorization:

1. Compute the $QR$ factorization of $A$ so that $A = Q R$.

2. Multiply the vector $b$ by $Q^*$ so that $y = Q^* b$.

3. Solve the triangular system $R x = y$ (or $R x = Q^* b$) using backward substitution.
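The three steps can be sketched directly, here using NumPy's Householder-based `qr` together with a hand-written backward substitution:

```python
import numpy as np

def solve_qr(A, b):
    """Solve A x = b via QR factorization and backward substitution."""
    Q, R = np.linalg.qr(A)        # step 1: A = Q R (Householder)
    y = Q.T @ b                   # step 2: y = Q* b (Q is real here)
    # Step 3: solve the upper triangular system R x = y from the bottom up
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6))
b = rng.standard_normal(6)
x = solve_qr(A, b)

print(np.allclose(A @ x, b))
```

In practice one would use a library triangular solver for step 3; the explicit loop is shown to make the backward substitution visible.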
We know that step (1) is backward stable; what about step (2), the matrix-vector multiplication? We can write the estimate of the backwards stability as

$$(\tilde{Q} + \delta Q) \tilde{y} = b, \quad ||\delta Q|| = \mathcal{O}(\epsilon_{machine})$$

where we have inverted the matrix $\tilde{Q}$ since it is unitary. Since this is exact we know that the matrix-vector multiplication is also backwards stable, since this is an equivalent statement to multiplying by a slightly perturbed matrix.
Step (3) is backward substitution (the computation of $\tilde{R}^{-1} \tilde{y}$). Writing the backwards stability estimate we have

$$(\tilde{R} + \delta R) \tilde{x} = \tilde{y}, \quad \frac{||\delta R||}{||\tilde{R}||} = \mathcal{O}(\epsilon_{machine})$$

demonstrating that the result is the exact solution to a slight perturbation of the original problem.
These results lead to the following two theorems:
Theorem: Using $QR$ factorization to solve $A x = b$ as described above is backward stable, satisfying

$$(A + \Delta A) \tilde{x} = b, \quad \frac{||\Delta A||}{||A||} = \mathcal{O}(\epsilon_{machine})$$

for some $\Delta A$.
Theorem: The solution $\tilde{x}$ computed by the above algorithm satisfies

$$\frac{||\tilde{x} - x||}{||x||} = \mathcal{O}(\kappa(A) \, \epsilon_{machine}).$$
