![]() | Text provided under a Creative Commons Attribution license, CC-BY. All code is made available under the FSF-approved MIT license. (c) Kyle T. Mandli |
Note: This material largely follows the text “Numerical Linear Algebra” by Trefethen and Bau (SIAM, 1997) and is meant as a guide and supplement to the material presented there.
from __future__ import print_function
%matplotlib inline
import matplotlib.pyplot as plt
import numpyIterative Methods¶
In this lecture we will consider a number of classical and more modern methods for solving sparse linear systems like those we found from our consideration of boundary value problems.
Motivation: Direct Methods¶
Gaussian elimination or other direct methods for solving linear systems of equations but are there other, better methods for solving these systems when there is structure that can be exploited?
For this lecture we will be considering an important type of linear system of equations that arise from discretizing the Poisson problem in one-dimension
How might you discretize the above ODE?
Following our finite difference discussion we can setup the following set of algebraic equations:
This forms the th row of the matrix. See if you can write down this matrix in the form of ignoring the boundary conditions for now.
Computational Load¶
Now consider using Gaussian elimination on the above matrix. For good measure let us consider a 3D problem and discretize each dimension with a leading to unknowns.
Gaussian Elimination - operations to solve, operations.
Suppose you have a machine that can perform 100 gigaflops (floating point operations per second):
Memory Load¶
What about memory?
We require to store entire array. In double precision floating point we would require 8-bytes per entry leading to
The situation really is not as bad as we are making it out to be as long as we take advantage of the sparse nature of the matrices. In fact for 1 dimensional problems direct methods can be reduced to in the case for a tridiagonal system. The situation is not so great for higher-dimensional problems however unless more structure can be leveraged. Examples of these types of solvers include fast Fourier methods such as fast Poisson solvers.
Jacobi and Gauss-Seidel¶
The Jacobi and Gauss-Seidel methods are simple approaches to introducing an iterative means for solving the problem when is sparse. Consider the general equation derived from the Poisson problem
If we rearrange this expression to solve for we have
For a direct method we would simultaneously find the values of , and but instead consider the iterative scheme computes an update to the equation above by using the past iterate (values we already know)
Since this allows us to evaluate without knowing the values of we directly evaluate this expression! This process is called Jacobi iteration. It can be shown that for this particular problem Jacobi iteration will converge from any initial guess although slowly.
Advantages
Matrix is never stored or created
Storage is optimal
are required per iteration
Example¶
Let’s try to solve the problem before in the BVP section but use Jacobi iterations to replace the direct solve
# Problem setup
a = 0.0
b = 1.0
u_a = 0.0
u_b = 3.0
f = lambda x: numpy.exp(x)
u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
# Descretization
N = 100
x_bc = numpy.linspace(a, b, N + 2)
x = x_bc[1:-1]
delta_x = (b - a) / (N + 1)
# Expected iterations needed
iterations_J = int(
2.0 * numpy.log(delta_x) / numpy.log(1.0 - 0.5 * numpy.pi**2 * delta_x**2)
)
# Solve system
# Initial guess for iterations
U_new = numpy.zeros(N + 2)
U_new[0] = u_a
U_new[-1] = u_b
convergence_J = numpy.zeros(iterations_J)
for k in range(iterations_J):
U = U_new.copy()
for i in range(1, N + 1):
U_new[i] = 0.5 * (U[i + 1] + U[i - 1]) - f(x_bc[i]) * delta_x**2 / 2.0
convergence_J[k] = numpy.linalg.norm(u_true(x_bc) - U_new, ord=2)
# Plot result
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.plot(x_bc, U, "o", label="Computed")
axes.plot(x_bc, u_true(x_bc), "k", label="True")
axes.set_title("Solution to $u_{xx} = e^x$")
axes.set_xlabel("x")
axes.set_ylabel("u(x)")
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.semilogy(range(iterations_J), convergence_J, "o")
axes.set_title("Convergence of Jacobi Iterations for Poisson Problem")
axes.set_xlabel("Iteration")
axes.set_ylabel("$||U^{(k)} - U^{(k-1)}||_2$")
plt.show()A slight modification to the above leads also to the Gauss-Seidel method. Programmtically it is easy to see the modification but in the iteration above we now will have
# Problem setup
a = 0.0
b = 1.0
u_a = 0.0
u_b = 3.0
f = lambda x: numpy.exp(x)
u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
# Descretization
N = 100
x_bc = numpy.linspace(a, b, N + 2)
x = x_bc[1:-1]
delta_x = (b - a) / (N + 1)
# Expected iterations needed
iterations_GS = int(
2.0 * numpy.log(delta_x) / numpy.log(1.0 - numpy.pi**2 * delta_x**2)
)
# Solve system
# Initial guess for iterations
U = numpy.zeros(N + 2)
U[0] = u_a
U[-1] = u_b
convergence_GS = numpy.zeros(iterations_GS)
success = False
for k in range(iterations_GS):
for i in range(1, N + 1):
U[i] = 0.5 * (U[i + 1] + U[i - 1]) - f(x_bc[i]) * delta_x**2 / 2.0
convergence_GS[k] = numpy.linalg.norm(u_true(x_bc) - U, ord=2)
# Plot result
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.plot(x_bc, U, "o", label="Computed")
axes.plot(x_bc, u_true(x_bc), "k", label="True")
axes.set_title("Solution to $u_{xx} = e^x$")
axes.set_xlabel("x")
axes.set_ylabel("u(x)")
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.semilogy(range(iterations_GS), convergence_GS, "o")
axes.set_title("Convergence of Gauss-Seidel Iterations for Poisson Problem")
axes.set_xlabel("Iteration")
axes.set_ylabel("$||U^{(k)} - U^{(k-1)}||_2$")
plt.show()Matrix Splitting Methods¶
One way to view Jacobi and Gauss-Seidel is as a splitting of the matrix so that
Then the system can be viewed as
Viewing this instead as an iteration we have then
The goal then would be to pick and such that contains as much of as possible while remaining easier to solve than the original system.
The resulting update for each of these then becomes
where is called the iteration matrix and . We also want
where is the true solution of the original , in other words is the fixed point of the iteration. Is this fixed point stable though? If the spectral radius we can show that in fact the iteration is stable.
For Jacobi the splitting is
(sticking to the Poisson problem). is now a diagonal matrix and easy to solve.
For Gauss-Seidel we have
Stopping Criteria¶
How many iterations should we perform? Let represent the error present at step . If we want to reduce the error at the first step to order then we have
Under suitable assumption we can bound the error in the 2-norm as
Moving back to our estimate of the number of iterations we can combine our two expressions involving the error by taking which allows us to write
taking into account error convergence.
Picking is a bit tricky but one natural criteria to use would be since our original discretization was 2nd-order accurate. This leads to
This also allows us to estimate the total number of operations that need to be used.
For Jacobi we have the spectral radius of as
so that
where here is now the number of points used.
Combining this with the previous operation count per iteration we find that Jacobi would lead to work which is not very promising.
For two dimensions we have so even compared to Gaussian elimination this approach is not ideal.
What about Gauss-Seidel? Here the spectral radius is approximately
so that
which still does not lead to any advantage over direct solvers. It does show that Gauss-Seidel does actually converge faster due to the factor of 2 difference between and .
Successive Overrelaxation (SOR)¶
Well that’s a bit dissapointing isn’t it? These iterative schemes do not seem to be worth much but it turns out we can do better with a slight modification to Gauss-Seidel.
If you look at Gauss-Seidel iteration it turns out it moves in the correct direction to but is very conservative in the amount. If instead we do
where we get to pick we can do much better.
If then we get back Gauss-Seidel.
If we move even less and converges even more slowly (although is sometimes used for multigrid under the name underrelaxation).
If then we move further than Gauss-Seidel suggests and any method where is known as successive overrelaxation (SOR).
We can write this as a matrix splitting method as well. We can combine the two-step formula above to find
corresponding to a matrix splitting of
where is the diagonal of the matrix , and and are the lower and upper triangular parts without the diagonal of .
It can be shown that picking an such that the SOR method converges.
It turns out we can also find an optimal for a wide class of problems. For Poisson problems in any number of space dimensions for instance it can be shown that SOR method converges optimaly if
What about the number of iterations? We can follow the same tactic as before with the spectral radius of now
# Problem setup
a = 0.0
b = 1.0
u_a = 0.0
u_b = 3.0
f = lambda x: numpy.exp(x)
u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
# Descretization
N = 100
x_bc = numpy.linspace(a, b, N + 2)
x = x_bc[1:-1]
delta_x = (b - a) / (N + 1)
# SOR parameter
omega = 2.0 / (1.0 + numpy.sin(numpy.pi * delta_x))
# Expected iterations needed
iterations_SOR = int(2.0 * numpy.log(delta_x) / numpy.log(omega - 1.0))
# Solve system
# Initial guess for iterations
U = numpy.zeros(N + 2)
U[0] = u_a
U[-1] = u_b
convergence_SOR = numpy.zeros(iterations_SOR)
for k in range(iterations_SOR):
for i in range(1, N + 1):
U_gs = 0.5 * (U[i - 1] + U[i + 1] - delta_x**2 * f(x_bc[i]))
U[i] += omega * (U_gs - U[i])
convergence_SOR[k] = numpy.linalg.norm(u_true(x_bc) - U, ord=2)
# Plot result
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.plot(x_bc, U, "o", label="Computed")
axes.plot(x_bc, u_true(x_bc), "k", label="True")
axes.set_title("Solution to $u_{xx} = e^x$")
axes.set_xlabel("x")
axes.set_ylabel("u(x)")
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.semilogy(range(iterations_SOR), convergence_SOR, "o")
axes.set_title("Convergence of SOR Iterations for Poisson Problem")
axes.set_xlabel("Iteration")
axes.set_ylabel("$||U^{(k)} - U^{(k-1)}||_2$")
plt.show()# Plotting all the convergence rates
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.semilogy(
range(iterations_SOR), convergence_J[:iterations_SOR], "r", label="Jacobi"
)
axes.semilogy(
range(iterations_SOR), convergence_GS[:iterations_SOR], "b", label="Gauss-Seidel"
)
axes.semilogy(range(iterations_SOR), convergence_SOR, "k", label="SOR")
axes.legend(loc=3)
axes.set_title("Comparison of Convergence Rates")
axes.set_xlabel("Iteration")
axes.set_ylabel("$||U^{(k)} - U^{(k-1)}||_2$")
plt.show()Descent Methods¶
One special case of matrices are amenable to another powerful way to iterate to the solution. A matrix is said to be symmetric positive definite (SPD) if
Now define a function such that
This is a quadratic function in the variables and in the case where forms a parabolic bowl. Since this is a quadratic function there is a unique minimum, .
def phi(X, Y, A, f):
return (
0.5 * (A[0, 0] * X**2 + A[0, 1] * X * Y + A[1, 0] * X * Y + A[1, 1] * Y**2)
- X * f[0]
- Y * f[1]
)
x = numpy.linspace(-15, 15, 100)
y = numpy.linspace(-15, 15, 100)
X, Y = numpy.meshgrid(x, y)
f = numpy.array([1.0, 2.0])
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 3)
axes = fig.add_subplot(1, 3, 1, aspect="equal")
A = numpy.identity(2)
axes.contour(X, Y, phi(X, Y, A, f), 25)
axes = fig.add_subplot(1, 3, 2, aspect="equal")
A = numpy.array([[2, -1], [-1, 2]])
axes.contour(X, Y, phi(X, Y, A, f), 25)
axes = fig.add_subplot(1, 3, 3, aspect="equal")
A = numpy.array([[2, -1.8], [-1.7, 2]])
axes.contour(X, Y, phi(X, Y, A, f), 25)
plt.show()What property of the matrix simplifies the expression above?
Symmetry! This implies that and the expression above simplifies to
Now write two expressions that when evaluated at are identically 0 that express that minimizes .
Since minimizes we know that the first derivatives should be zero at the minimum:
Note that these equations can be rewritten as
The conclusion here then is that finding the minimum of is equivalent to solving the system ! This is a common type of reformulation for many problems where it may be easier to treat a given equation as a minimization problem rather than directly solve it.
Note that this is not quite the matrix that we have been using for our Poisson problem so far which is actually symmetric negative definite although these same methods work as well. In this case we actually want to find the maximum of instead, other than that everything is the same.
Also note that if is indefinite then the eigenvalues of will change sign and instead of a stable minimum or maximum we have a saddle point which are much more difficult to handle (GMRES can for instance).
Method of Steepest Descent¶
So now we turn to finding the that minimizes the function . The simplest approach to this is called the method of steepest descent which finds the direction of the largest gradient of and goes in that direction.
We can find by
i.e. the that takes us just far enough so that if we went any further would increase.
This implies that and only if we are at the minimum of . We can compute the gradient of as
where is the residual defined as
Looking back at the definition of then leads to the conclusion that the that would minimize the expression would be the one that satisfies
To find this note that
so that the derivative becomes
Setting this to zero than leads to
# Problem setup
a = 0.0
b = 1.0
u_a = 0.0
u_b = 3.0
f = lambda x: numpy.exp(x)
u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
# Descretization
N = 50
x_bc = numpy.linspace(a, b, N + 2)
x = x_bc[1:-1]
delta_x = (b - a) / (N + 1)
# Construct matrix A
A = numpy.zeros((N, N))
diagonal = numpy.ones(N) / delta_x**2
A += numpy.diag(diagonal * 2.0, 0)
A += numpy.diag(-diagonal[:-1], 1)
A += numpy.diag(-diagonal[:-1], -1)
# Construct right hand side
b = -f(x)
b[0] += u_a / delta_x**2
b[-1] += u_b / delta_x**2
# Algorithm parameters
MAX_ITERATIONS = 10000
tolerance = 1e-3
# Solve system
U = numpy.empty(N)
convergence_SD = numpy.zeros(MAX_ITERATIONS)
success = False
for k in range(MAX_ITERATIONS):
r = b - numpy.dot(A, U)
if numpy.linalg.norm(r, ord=2) < tolerance:
success = True
break
alpha = numpy.dot(r, r) / numpy.dot(r, numpy.dot(A, r))
U = U + alpha * r
convergence_SD[k] = numpy.linalg.norm(u_true(x) - U, ord=2)
if not success:
print("Iteration failed to converge!")
print(convergence_SD[-1])
else:
# Plot result
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.plot(x, U, "o", label="Computed")
axes.plot(x_bc, u_true(x_bc), "k", label="True")
axes.set_title("Solution to $u_{xx} = e^x$")
axes.set_xlabel("x")
axes.set_ylabel("u(x)")
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.semilogy(range(k), convergence_SD[:k], "o")
axes.set_title("Convergence of Steepest Descent for Poisson Problem")
axes.set_xlabel("Iteration")
axes.set_ylabel("$||U^{(k)} - U^{(k-1)}||_2$")
plt.show()Convergence of Steepest Descent¶
What controls the convergence of steepest descent? It turns out that the shape of the parabolic bowl formed by is the major factor determining the convergence of steepest descent. For example, if is a scalar multiple of the identity than these ellipses are actually circles and steepest descent converges in steps. If does not lead to circles, the convergence is based on the ratio between the semi-major and semi-minor axis of the resulting ellipses dimensional ellipses. This is controlled by the smallest and largest eigenvalues of the matrix hence why steepest descent grows increasingly difficult as increases for the Poisson problem. Note that this also relates to the condition number of the matrix in the norm.

def steepest_descent(A, U, b, axes):
MAX_ITERATIONS = 10000
tolerance = 1e-3
success = False
for k in range(MAX_ITERATIONS):
axes.text(U[0] + 0.1, U[1] + 0.1, str(k), fontsize=12)
axes.plot(U[0], U[1], "ro")
r = b - numpy.dot(A, U)
if numpy.linalg.norm(r, ord=2) < tolerance:
success = True
break
alpha = numpy.dot(r, r) / numpy.dot(r, numpy.dot(A, r))
U = U + alpha * r
if success:
return k
else:
raise Exception("Iteration did not converge.")
phi = (
lambda X, Y, A: 0.5
* (A[0, 0] * X**2 + A[0, 1] * X * Y + A[1, 0] * X * Y + A[1, 1] * Y**2)
- X * f[0]
- Y * f[1]
)
x = numpy.linspace(-15, 15, 100)
y = numpy.linspace(-15, 15, 100)
X, Y = numpy.meshgrid(x, y)
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
# fig.set_figwidth(fig.get_figwidth() * 3)
# axes = fig.add_subplot(1, 3, 1, aspect='equal')
axes = fig.add_subplot(1, 1, 1, aspect="equal")
A = numpy.identity(2)
f = numpy.array([0.0, 0.0])
axes.contour(X, Y, phi(X, Y, A), 25)
print("Iteration count: %s" % steepest_descent(A, numpy.array([-10.0, -12.0]), f, axes))
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
# axes = fig.add_subplot(1, 3, 2, aspect='equal')
axes = fig.add_subplot(1, 1, 1, aspect="equal")
A = numpy.array([[2, -1], [-1, 2]])
f = numpy.array([1.0, 2.0])
axes.contour(X, Y, phi(X, Y, A), 25)
print("Iteration count: %s" % steepest_descent(A, numpy.array([-10.0, -12.0]), f, axes))
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
axes = fig.add_subplot(1, 1, 1, aspect="equal")
# axes = fig.add_subplot(1, 3, 3, aspect='equal')
A = numpy.array([[2, -1.8], [-1.7, 2]])
f = numpy.array([1.0, 2.0])
axes.contour(X, Y, phi(X, Y, A), 25)
print("Iteration count: %s" % steepest_descent(A, numpy.array([-10.0, -12.0]), f, axes))-Conjugate Search Directions and Conjugate Gradient¶
An alternative to steepest descent is to choose a slightly different direction to descend down. Generalizing our step from above let the iterative scheme be
and as before we want to pick an such that
leading again to the choice of of
accept now we are also allowed to pick the search direction .
Ways to choose :
One bad choice for would be orthogonal to since this would then be tangent to the level set (ellipse) of and would cause it to only increase so we want to make sure that (the inner-product).
We also want to still move downwards so require that .
We know that is not always the best direction to go in but what might be better? We could head directly for the minimum but how do we do that without first knowing ?
It turns out when we can do this from any initial guess and initial direction we will arrive at the minimum in 2 steps if we choose the second search direction dependent on
In general if two vectors satisfy this property they are said to be -conjugate.
Note that if then these two vectors would be orthogonal to each other and in this sense -conjugacy is a natural extension from orthogonality and the simple case from before where the ellipses are circles to the case where we can have very distorted ellipses.
In fact the vector is tangent to the level set that lies on and therefore choosing so that it is -conjugate to then always heads to the center of the ellipse.
In other words, once we know a tangent to one of the ellipses we can always choose a direction that minimizes in one of the dimensions of the search space. Choosing the iteratively this way forms the basis of the conjugate gradient method.

Now to generalize beyond consider the case. As stated before we are now in a three-dimensional space where the level-sets are concentric ellipsoids. Taking a slice through this space will lead to an ellipse on the slice.
Start with an initial guess and choose a search direction .
Minimize in the direction resulting in the choice
as we saw before. Now set .
Choose to be -conjugate to . In this case there are an infinite set of vectors that are possible that satisfy . Beyond requiring to be -conjugate we also want it to be linearly-independent to .
Again choose an that minimizes the residual (again tangent to the level sets of ) in the direction and repeat the process
# Problem setup
a = 0.0
b = 1.0
u_a = 0.0
u_b = 3.0
f = lambda x: numpy.exp(x)
u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
# Descretization
N = 50
x_bc = numpy.linspace(a, b, N + 2)
x = x_bc[1:-1]
delta_x = (b - a) / (N + 1)
# Construct matrix A
A = numpy.zeros((N, N))
diagonal = numpy.ones(N) / delta_x**2
A += numpy.diag(diagonal * 2.0, 0)
A += numpy.diag(-diagonal[:-1], 1)
A += numpy.diag(-diagonal[:-1], -1)
# Construct right hand side
b = -f(x)
b[0] += u_a / delta_x**2
b[-1] += u_b / delta_x**2
# Algorithm parameters
MAX_ITERATIONS = 2 * N
tolerance = delta_x**2
# Solve system
U = numpy.zeros(N)
convergence_CG = numpy.zeros(MAX_ITERATIONS)
success = False
r = numpy.dot(A, U) - b
p = -r
r_prod_prev = numpy.dot(r, r)
for k in range(MAX_ITERATIONS):
w = numpy.dot(A, p)
alpha = r_prod_prev / numpy.dot(p, w)
U += alpha * p
r += alpha * w
beta = numpy.dot(r, r) / r_prod_prev
r_prod_prev = numpy.dot(r, r)
if numpy.linalg.norm(r, ord=2) < tolerance:
success = True
break
p = beta * p - r
convergence_CG[k] = numpy.linalg.norm(u_true(x) - U, ord=2)
if not success:
print("Iteration failed to converge!")
else:
# Plot result
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.plot(x, U, "o", label="Computed")
axes.plot(x_bc, u_true(x_bc), "k", label="True")
axes.set_title("Solution to $u_{xx} = e^x$")
axes.set_xlabel("x")
axes.set_ylabel("u(x)")
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.semilogy(convergence_CG[: k + 2], "o")
axes.set_title("Convergence of Conjugate Gradient for Poisson Problem")
axes.set_xlabel("Iteration")
axes.set_ylabel("$||U^{(k)} - U^{(k-1)}||_2$")
plt.show()def CG(A, U, b, axes):
MAX_ITERATIONS = 10000
tolerance = 1e-3
success = False
r = numpy.dot(A, U) - b
p = -r
r_prod_prev = numpy.dot(r, r)
axes.text(U[0] + 0.1, U[1] + 0.1, str(0), fontsize=12)
axes.plot(U[0], U[1], "ro")
for k in range(MAX_ITERATIONS):
w = numpy.dot(A, p)
alpha = r_prod_prev / numpy.dot(p, w)
U += alpha * p
axes.text(U[0] + 0.1, U[1] + 0.1, str(k + 1), fontsize=12)
axes.plot(U[0], U[1], "ro")
r += alpha * w
beta = numpy.dot(r, r) / r_prod_prev
r_prod_prev = numpy.dot(r, r)
if numpy.linalg.norm(r, ord=2) < tolerance:
success = True
break
p = beta * p - r
if success:
return k
else:
raise Exception("Iteration did not converge.")
phi = (
lambda X, Y, A: 0.5
* (A[0, 0] * X**2 + A[0, 1] * X * Y + A[1, 0] * X * Y + A[1, 1] * Y**2)
- X * f[0]
- Y * f[1]
)
x = numpy.linspace(-15, 15, 100)
y = numpy.linspace(-15, 15, 100)
X, Y = numpy.meshgrid(x, y)
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
# fig.set_figwidth(fig.get_figwidth() * 3)
# axes = fig.add_subplot(1, 3, 1, aspect='equal')
axes = fig.add_subplot(1, 1, 1, aspect="equal")
A = numpy.identity(2)
f = numpy.array([0.0, 0.0])
axes.contour(X, Y, phi(X, Y, A), 25)
print("Iteration count: %s" % CG(A, numpy.array([-10.0, -12.0]), f, axes))
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
# axes = fig.add_subplot(1, 3, 2, aspect='equal')
axes = fig.add_subplot(1, 1, 1, aspect="equal")
A = numpy.array([[2, -1], [-1, 2]])
f = numpy.array([1.0, 2.0])
axes.contour(X, Y, phi(X, Y, A), 25)
print("Iteration count: %s" % CG(A, numpy.array([-10.0, -12.0]), f, axes))
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
axes = fig.add_subplot(1, 1, 1, aspect="equal")
# axes = fig.add_subplot(1, 3, 3, aspect='equal')
A = numpy.array([[2, -1.8], [-1.7, 2]])
f = numpy.array([1.0, 2.0])
axes.contour(X, Y, phi(X, Y, A), 25)
print("Iteration count: %s" % CG(A, numpy.array([-10.0, -12.0]), f, axes))
plt.show()Preconditioning¶
One way to get around the difficulties with these types of methods due to the distortion of the ellipses (and consequently the conditioning of the matrix) is to precondition the matrix. The basic idea is that we take our original problem and instead solve
Note that since we need to find the inverse of , this matrix should be nice. A couple of illustrative examples may help to illustrate why this might be a good idea:
If then we essentially have solved our problem already although that does not help us much
If , then is easily computed and it turns out for some problems this can decrease the condition number of significantly. Note though that this is not actually helpful in the case of the Poisson problem.
If is based on another iterative method used on , for instance Gauss-Seidel, these can be effective general use preconditioners for many problems.
So the next question then becomes how to choose a preconditioner. This is usually very problem specific and a number of papers suggest strategies for particular problems.
