Skip to article frontmatterSkip to article content

Numerical Solution to ODE Initial Value Problems - Part 2

Text provided under a Creative Commons Attribution license, CC-BY. All code is made available under the FSF-approved MIT license. (c) Kyle T. Mandli

Note: The presentation below largely follows part II in “Finite Difference Methods for Ordinary and Partial Differential Equations” by LeVeque (SIAM, 2007).

from __future__ import print_function

%matplotlib inline
import matplotlib.pyplot as plt
import numpy

Numerical Solution to ODE Initial Value Problems - Part 2

Overview -- so far

So far we have discussed 3 basic techniques for integration of ODE IVP’s

  • Single-Step Multi-Stage schemes (explicit and implicit)

  • Taylor’s Series Methods

  • Linear Multi-Step Schemes (explicit and implicit)

as well as

  • truncation error of each method (and its relation to step-error)

  • adaptive stepping

Overview -- to do

This notebook will continue the analysis of these methods to discuss the other important issues of

  • Convergence (Global Error) -- does the discrete solution converge to the continuous analytic solution?

  • Stability -- Are there limits to how large a time-step we can take?

  • Stiff Equations: consequences of stability

  • Implicit Methods: Solution of non-linear systems of equation using Newton’s Method

Convergence

We can think of an ODE method as providing a sequence of approximations UNU_N where here NN is the number of time steps needed to reach the final time of interest tft_f, in effect we are increasing the resolution of the method. We then say that a method is convergent if this sequence converges to the true solution at the same time

limNUN=u(tf).\lim_{N\rightarrow \infty} U_N = u(t_f).

We can also define this in a more familiar way in terms of Δt\Delta t such that

NΔt=tf        N=tfΔtN \Delta t = t_f ~~~~ \Rightarrow ~~~~ N = \frac{t_f}{\Delta t}

so that our definition of convergence becomes

limΔt0UΔt=u(tf).\lim_{\Delta t \rightarrow 0} U_{\Delta t} = u(t_f).

In general for a method to be convergent it must be

  • consistent which as before meant that the local truncation error T=O(Δtp)T = \mathcal{O}(\Delta t^p) where p>0p > 0,

  • zero-stable which implies that the sum total of the errors as Δt0\Delta t \rightarrow 0 is bounded and has the same order as TT (the truncation error) which we know goes to zero as Δt0\Delta t \rightarrow 0.

put another way, a method is convergent if the global error

E=UN(tf)u(tf)0asNE = | U_N(t_f) - u(t_f)| \rightarrow 0\quad\mathrm{as}\quad N\rightarrow\infty

Example: Forward Euler on a Linear Problem

Consider the simple linear problem

dudt=λuwithu(0)=u0\frac{\text{d}u}{\text{d}t} = \lambda u \quad \text{with} \quad u(0) = u_0

which we know has the solution u(t)=u0eλtu(t) = u_0 e^{\lambda t}. Applying Euler’s method to this problem leads to

Un+1=Un+ΔtλUn=(1+Δtλ)Un\begin{aligned} U_{n+1} &= U_n + \Delta t\lambda U_n \\ &= (1 + \Delta t \lambda) U_n \end{aligned}

We also know the local truncation error is defined by

Tn=1Δt[Un+1un+1]=1Δt[(1+λΔt)unun+1]\begin{aligned} T_n &= \frac{1}{\Delta t} \left[ U_{n+1} - u_{n+1} \right ] \\ &= \frac{1}{\Delta t} \left[ (1 + \lambda \Delta t) u_n - u_{n+1} \right ] \end{aligned}

which can be rearranged to find

un+1=(1+Δtλ)unΔtTn.u_{n+1} = (1 + \Delta t \lambda) u_n - \Delta t T_n.

Note that all values here are in terms of the exact solution where as the application of Euler’s method is in terms of the approximate solution.

Defining the global error as

En=unUnE_{n} = u_n - U_n

we can subtract the last two expressions to find

un+1Un+1=(1+Δtλ)unΔtTnun+1(1+Δtλ)UnUn+1En+1=(1+Δtλ)(unUn)ΔtTn=(1+Δtλ)EnΔtTn,\begin{aligned} u_{n+1} - U_{n+1} &= \underbrace{(1 + \Delta t \lambda) u_n - \Delta t T_n}_{u_{n+1}} - \underbrace{(1 + \Delta t \lambda) U_n}_{U_{n+1}} \\ E_{n+1} &= (1 + \Delta t \lambda) (u_n - U_n) - \Delta t T_n \\ &= (1 + \Delta t \lambda) E_n - \Delta t T_n, \end{aligned}

a recursive definition for the global error that connects it to the local truncation error.

Working backwards from the global error EnE_n at maximum time step n=tf/Δtn = t_f/\Delta t, the first few iterates of this look like

En=(1+λΔt)En1ΔtTn1=(1+λΔt)[(1+λΔt)En2ΔtTn2]ΔtTn1=(1+λΔt)[(1+λΔt){(1+λΔt)En3ΔtTn3}ΔtTn2]ΔtTn1\begin{align} E_n &= (1 + \lambda \Delta t) E_{n - 1} - \Delta t T_{n-1} \\ &= (1 + \lambda \Delta t) \left [(1 + \lambda \Delta t) E_{n - 2} - \Delta t T_{n-2} \right ] - \Delta t T_{n-1} \\ &= (1 + \lambda \Delta t) \left [(1 + \lambda \Delta t) \left \{(1 + \lambda \Delta t) E_{n - 3} - \Delta t T_{n-3} \right \} - \Delta t T_{n-2} \right ] - \Delta t T_{n-1} \end{align}

Which at this point looks like

En=(1+λΔt)3En3Δti=13(1+Δtλ)i1TniE_n = (1 + \lambda \Delta t)^3E_{n-3} - \Delta t\sum_{i=1}^3 (1+\Delta t\lambda)^{i-1}T_{n-i}

So taking it back nn steps (3n3\rightarrow n) gives

En=(1+λΔt)nE0Δti=1n(1+Δtλ)i1TniE_n = (1 + \lambda \Delta t)^nE_0 - \Delta t\sum_{i=1}^n (1+\Delta t\lambda)^{i-1}T_{n-i}

Expanding this expression out backwards in time to n=0n=0 leads to

En=(1+Δtλ)nE0Δti=1n(1+Δtλ)i1Tni.E_n = (1 + \Delta t \lambda)^n E_0 - \Delta t \sum^n_{i=1} (1 + \Delta t \lambda)^{i-1} T_{n - i}.

Noting that (1+Δtλ)(1 + \Delta t \lambda) are the first two terms in the Taylor series for eλΔte^{\lambda\Delta t} we will bound this term by

eΔtλ1+Δtλe^{\Delta t \lambda} \geq |1 + \Delta t \lambda|

which then implies the term in the summation can be bounded by

1+Δtλn1e(n1)ΔtλenΔtλeλtf|1 + \Delta t \lambda|^{n - 1} \leq e^{(n-1) \Delta t |\lambda|} \leq e^{n \Delta t |\lambda|} \leq e^{|\lambda| t_f}

Using this expression in the expression for the global error we find

En=(1+Δtλ)nE0Δti=1n(1+Δtλ)i1TniEneλΔtnE0+Δti=1neλtfTnieλtf(E0+Δti=1nTni)eλtf(E0+nΔtmax1inTni)\begin{aligned} E_n &= (1 + \Delta t \lambda)^n E_0 - \Delta t \sum^n_{i=1} (1 + \Delta t \lambda)^{i-1} T_{n-i} \\ |E_n| &\leq e^{|\lambda| \Delta t n} |E_0| + \Delta t \sum^n_{i=1} e^{|\lambda| t_f} |T_{n-i}| \\ &\leq e^{|\lambda| t_f} \left(|E_0| + \Delta t \sum^n_{i=1} |T_{n - i}|\right) \\ &\leq e^{|\lambda| t_f} \left(|E_0| + n \Delta t \max_{1 \leq i \leq n} |T_{n - i}|\right) \end{aligned}

or

Eneλtf(E0+tfmax1inTni)\begin{aligned} E_n &\leq e^{|\lambda| t_f} \left(|E_0| + t_f \max_{1 \leq i \leq n} |T_{n - i}|\right) \end{aligned}

In other words the global error is bounded by the original global error and the maximum one-step truncation error made multiplied by the number of time steps taken. If N=tfΔtN = \frac{t_f}{\Delta t} as before and taking into account the local truncation error we can simplify this expression further to

Eneλtf[E0+tf(12Δtu+O(Δt2))]|E_n| \leq e^{|\lambda| t_f} \left[|E_0| + t_f \left(\frac{1}{2} \Delta t |u''| + \mathcal{O}(\Delta t^2)\right ) \right]

If we assume that we have used the correct initial condition u0u_0 then E00E_0 \rightarrow 0 as Δt0\Delta t \rightarrow 0 and we see that the global error EnE_n is bounded by O(Δt)\mathcal{O}(\Delta t), the same as the local truncation error:

Eneλtftf(12Δtu+O(Δt2))=O(Δt).|E_n| \leq e^{|\lambda| t_f} t_f \left(\frac{1}{2} \Delta t |u''| + \mathcal{O}(\Delta t^2)\right ) = \mathcal{O}(\Delta t).

More generally, it can be shown that for all single-step multi-stage schemes (of which Euler is the simplest), the Global error has the same order as the Truncation error, all of which scale as

Δtp,p>0\Delta t^p,\quad p>0

and therefore all of these schemes converge.

Absolute Stability

Although zero-stability guarantees stability it is much more difficult to work with in general as the limit Δt0\Delta t \rightarrow 0 can be difficult to compute (and in general isn’t ideal as it requires an infinite amount of computation). Instead we often consider a finite Δt\Delta t and examine if the method is stable for this particular choice of Δt\Delta t. This has the practical upside that it will also tell us what particular Δt\Delta t will ensure that our method is indeed stable.

Example

Consider the problem

u(t)=λ(ucost)sint    with    u(0)=1u'(t) = \lambda (u - \cos t) - \sin t ~~~~ \text{with} ~~~~ u(0) = 1

whose exact solution is

u(t)=cost.u(t) = \cos t.

We can compute an estimate for what Δt\Delta t we need to use by examining the truncation error for Euler’s method

T=12Δtu(t)+O(Δt2)=12Δtcost+O(Δt2)\begin{aligned} T &= \frac{1}{2} \Delta t u''(t) + \mathcal{O}(\Delta t^2) \\ &= -\frac{1}{2} \Delta t \cos t + \mathcal{O}(\Delta t^2) \end{aligned}

and therefore

EnΔtmax0ttfcost=Δt.|E_n| \leq \Delta t \max_{0 \leq t \leq t_f} |\cos t| = \Delta t.

If we want a solution where En<103|E_n| < 10^{-3} then Δt103\Delta t \approx 10^{-3}. Turning to the application of Euler’s method lets apply this to the case where λ=10\lambda = -10 and λ=2100\lambda = -2100.

# Implement vectorized Forward Euler
def euler(f, t_span, u0, N):
    """simple implementation of constant step-size forward euler method for vector valued f
    This doc string should have so much more in it
    """
    t = numpy.linspace(t_span[0], t_span[1], N)
    if numpy.isscalar(u0):
        u = numpy.empty(t.shape)
    else:
        u = numpy.empty((len(t), len(u0)))
    u[0] = u0
    delta_t = t[1] - t[0]
    for n, t_n in enumerate(t[:-1]):
        K1 = delta_t * f(t_n, u[n])
        u[n + 1] = u[n] + K1
    return t, u.transpose()
# Implement vectorized RK4
def RK4(f, t_span, u0, N):
    """simple implementation of constant step-size RK4 scheme for vector valued f
    This doc string should have so much more in it
    """
    t = numpy.linspace(t_span[0], t_span[1], N)
    if numpy.isscalar(u0):
        u = numpy.empty(t.shape)
    else:
        u = numpy.empty((len(t), len(u0)))
    u[0] = u0
    delta_t = t[1] - t[0]
    for n, t_n in enumerate(t[:-1]):
        K1 = delta_t * f(t_n, u[n])
        K2 = delta_t * f(t_n + delta_t / 2.0, u[n] + K1 / 2.0)
        K3 = delta_t * f(t_n + delta_t / 2.0, u[n] + K2 / 2.0)
        K4 = delta_t * f(t_n + delta_t, u[n] + K3)
        u[n + 1] = u[n] + (K1 + 2.0 * (K2 + K3) + K4) / 6.0
    return t, u.transpose()
# Compare accuracy between Euler for different values of lambda
f = lambda t, u, lam: lam * (u - numpy.cos(t)) - numpy.sin(t)
u_exact = lambda t: numpy.cos(t)
u_0 = u_exact(0.0)

t_span = [0.0, 2.0]
num_steps = [2**n for n in range(4, 9)]
# num_steps = [2**n for n in range(15,20)]
delta_t = numpy.empty(len(num_steps))
error_10 = numpy.empty(len(num_steps))
error_2100 = numpy.empty(len(num_steps))

t_f = t_span[1]
u_f = u_exact(t_f)

for i, N in enumerate(num_steps):
    # Compute Euler solution
    ff = lambda t, u: f(t, u, -10.0)
    t, U_10 = euler(ff, t_span, u_0, N)
    delta_t[i] = t[1] - t[0]
    error_10[i] = numpy.abs(U_10[-1] - u_f) / numpy.abs(u_f)
    ff = lambda t, u: f(t, u, -2100.0)
    t, U_2100 = euler(ff, t_span, u_0, N)
    error_2100[i] = numpy.abs(U_2100[-1] - u_f) / numpy.abs(u_f)

print("Error: lambda =   10: {}".format(error_10))
print("Error: lambda = 2100: {}".format(error_2100))

# Plot error vs. delta_t
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
axes = fig.add_subplot(1, 2, 1)

axes.loglog(delta_t, error_10, "bo", label="Forward Euler")

order_C = lambda delta_x, error, order: numpy.exp(
    numpy.log(error) - order * numpy.log(delta_x)
)
axes.loglog(
    delta_t,
    order_C(delta_t[1], error_10[1], 1.0) * delta_t**1.0,
    "r--",
    label="1st Order",
)

axes.legend(loc=4)
axes.set_title("Error - $\lambda = 10$")
axes.set_xlabel("$\Delta t$")
axes.set_ylabel("$|U(t_f) - u(t_f)|$")

axes = fig.add_subplot(1, 2, 2)

axes.loglog(delta_t, error_2100, "bo", label="Forward Euler")
axes.loglog(
    delta_t,
    order_C(delta_t[1], error_2100[1], 1.0) * delta_t**1.0,
    "r--",
    label="1st Order",
)

axes.set_title("Error - $\lambda = 2100$")
axes.set_xlabel("$\Delta t$")
axes.set_ylabel("$|U(t_f) - u(t_f)|$")

plt.show()

So what went wrong with λ=2100\lambda = -2100, the global error should go as

En=O(Δt)?E_n = \mathcal{O}(\Delta t)?

If Δt103\Delta t \approx 10^{-3} then for the case λ=10\lambda = -10 the previous global error is multiplied by 1+λΔt1+\lambda\Delta t

1+10310=0.991 + 10^{-3} \cdot -10 = 0.99

which means the contribution from EnE^n will slowly decrease as we take more time steps. For the other case we have

1+1032100=1.11 + 10^{-3} \cdot -2100 = -1.1

which means that for this Δt\Delta t the error made in previous time steps will grow! For this not to happen we would have to have Δt<1/2100\Delta t < 1 / 2100 which would lead to convergence again.

A simple example

Consider our simplest test problem

u(t)=λu,u(0)=u0u'(t) = \lambda u, \quad u(0) = u_0

whose exact solution is

u(t)=u0eλt.u(t) = u_0 e^{\lambda t}.

which for real λ<0\lambda < 0 should decay to zero as tt\rightarrow\infty

Question?

What is the largest step-size Δt\Delta t for an Euler method, such that the discrete solution UN0U_N\rightarrow 0 as NN\rightarrow\infty?

Euler’s method on this simplest problem can be written

Un+1=Un+Δtf(Un)=Un+ΔtλUn=(1+λΔt)Un\begin{align} U_{n+1} &= U_n + \Delta t f(U_n) = U_n + \Delta t \lambda U_n\\ &= (1 + \lambda\Delta t)U_n\\ \end{align}

i.e. at each step the last solution is multiplied by the factor (1+λΔt)(1 + \lambda\Delta t)

After NN steps

UN=(1+λΔt)NU0U_N = (1 + \lambda\Delta t)^N U_0

Assuming λ<0\lambda < 0 what is the largest step size Δt\Delta t such that UN0U_N\rightarrow 0 as NN\rightarrow\infty?

Hopefully it’s clear that for this scheme to be stable requires

1+λΔt<1|1 + \lambda\Delta t| < 1

The key parameter is actually the combination of λ\lambda and the time step z=λΔtz=\lambda\Delta t, so stability requires that

1+z<1| 1 + z | < 1

or 𝑧∈[?,?]

plt.figure(figsize=(8, 6))
z = numpy.linspace(-3, 2, 200)
plt.plot(z, numpy.abs(1 + z))
plt.plot(z, numpy.ones(z.shape), "k--")
plt.plot([0.0, 0], [0.0, 3.0])
plt.xlabel("$z$", fontsize=16)
plt.ylabel("$|1 + z|$", fontsize=16)
plt.grid()
plt.show()

Given z[2,0]z\in[-2,0], the time step is simply

Δt=zλ\Delta t = \frac{z}{\lambda}

so the maximum time step we can take for λ<0\lambda<0 is

Δt=2λ>0\Delta t = -\frac{2}{\lambda} > 0
# Compare accuracy between Euler for different values of time steps and $\lambda = -1$
lam = -1.0
f = lambda t, u: lam * u
u_exact = lambda t: numpy.exp(lam * t)
u_0 = u_exact(0.0)

t_span = [0.0, 10.0]

N = 21
N = 11
N = 9
# N = 6

N = 5
abs_z = numpy.abs(lam * t_span[-1] / (N - 1))
print("Abs(z)={}".format(abs_z))
t, u = euler(f, t_span, u_0, N)
tt = numpy.linspace(0.0, t_span[1], 100)
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)
axes.plot(tt, u_exact(tt), label="$u_{{exact}}$")
axes.plot(t, u, "ro-", label="$u_{{euler}}$")
axes.set_xlabel("t")
axes.set_title("$\Delta t = {}$, $|z|={}$".format(t[1] - t[0], abs_z), fontsize=18)
axes.grid()
axes.legend(loc="best", fontsize=14)
plt.show()

Absolute Stability of the Forward Euler Method for complex zz

The region 1+z<1|1 + z| < 1 is the region of absolute stability. For more general problems λ\lambda may be complex (stemming from eigenvalues of real problems), so we generally consider the region of stability on the complex plane.

1+z<1| 1 + z | < 1

for zCz\in\mathbb{C}

# Plot the region of absolute stability for Forward Euler
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)

axes.fill(
    numpy.cos(t) - 1.0,
    numpy.sin(t),
    color=(255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0),
)
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
axes.set_xlim((-3, 3.0))
axes.set_ylim((-3, 3))
axes.set_aspect("equal")

axes.set_xlabel("Re(z)")
axes.set_ylabel("Im(z)")


axes.set_title("Absolute Stability Region for Forward Euler")
plt.show()

Using the stability plots to choose a stable Δt\Delta t

If you know the spectrum of eigenvalues λ\lambda for a given problem, you can plot them on the stability diagram and then find the value of Δt\Delta t such that all zi=λiΔtz_i = \lambda_i\Delta t are within the stability region.

E.g. consider Euler’s method on a single eigenvalue

λ=2+2i\lambda = -2 + 2i

with modulus λ=22|\lambda| = 2\sqrt{2} which plots outside of the stability region.

# Plot the region of absolute stability for Forward Euler
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

lam = (-2, 2)
t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)

axes.fill(
    numpy.cos(t) - 1.0,
    numpy.sin(t),
    color=(255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0),
)
axes.plot(lam[0], lam[1], "rx", markersize=8)
axes.plot((lam[0], 0.0), (lam[1], 0.0), "r--")
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
axes.set_xlim((-3, 3.0))
axes.set_ylim((-3, 3))
axes.set_aspect("equal")

axes.set_xlabel("Re(z)")
axes.set_ylabel("Im(z)")


axes.set_title("Absolute Stability Region for Forward Euler")
plt.show()

However z=λΔtz=\lambda\Delta t so

z=Δtλ|z|=\Delta t|\lambda|

Thus absolute stability requires that we find a time-step such that 1+z<1|1 + z| < 1. If we just look for the intersection with the boundaries of the stability region

1+z=1+Δt(2+2i)=1=(12Δt)2+4Δt2=1\begin{align} | 1 +z| &= | 1 +\Delta t(-2+2i) | = 1\\ & = \sqrt{ (1-2\Delta t)^2 + 4\Delta t^2} = 1 \\ \end{align}

implies (after a little work) that

Δt(2Δt1)=0\Delta t(2\Delta t - 1) = 0

or Δt(0,1/2)\Delta t \in (0, 1/2)

# Plot the region of absolute stability for Forward Euler
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

lam = (-2, 2)
t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)

axes.fill(
    numpy.cos(t) - 1.0,
    numpy.sin(t),
    color=(255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0),
)
axes.plot(lam[0], lam[1], "rx", markersize=8)
axes.text(lam[0], 1.1 * lam[1], "$\lambda$", fontsize=18)
axes.plot(0.5 * lam[0], 0.5 * lam[1], "kx", markersize=8)
axes.plot(0.5 * lam[0], 0.5 * lam[1], "kx", markersize=8)
axes.text(0.5 * lam[0], 0.6 * lam[1], "$\Delta t = 1/2$", fontsize=18)
axes.plot((lam[0], 0.0), (lam[1], 0.0), "r--")
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
axes.set_xlim((-3, 3.0))
axes.set_ylim((-3, 3))
axes.set_aspect("equal")

axes.set_xlabel("Re(z)")
axes.set_ylabel("Im(z)")


axes.set_title("Absolute Stability Region for Forward Euler")
plt.show()

Why complex eigenvalues? And why should we care?

A simple example: The simple harmonic oscillator

The linear harmonic oscillator can be written as a 2nd-order ODE for position xx of a mass on a spring

d2xdt2+x=0,x(0)=x0,dxdt=0.\frac{d^2 x}{dt^2} + x = 0, \quad x(0)=x_0,\, \frac{dx}{dt} = 0.

Which has solutions

x(t)=x0cos(t)x(t) = x_0\cos(t)

Alternatively, we can write this as a 2 dimensional system of first order equations by defining the velocity v=dx/dtv= dx/dt and $$

dxdt=vdvdt=x\begin{align} \frac{dx}{dt} &= v \\ \frac{dv}{dt} &= -x \\ \end{align}

$$

or more cleanly as

dudt=Au,u(0)=u0\frac{d\mathbf{u}}{dt} = A\mathbf{u},\quad\mathbf{u}(0)=\mathbf{u_0}

with

u=[xv]A=[0110]u0=[x00]\mathbf{u}=\begin{bmatrix} x\\ v\\\end{bmatrix}\quad A= \begin{bmatrix} 0 & 1 \\ -1 & 0 \\ \end{bmatrix}\quad\mathbf{u}_0=\begin{bmatrix} x_0\\ 0\\\end{bmatrix}

All diagonalizable linear dynamical systems have solutions of the form

u(t)=i=1ncieλitsi\mathbf{u}(t) = \sum_{i=1}^n c_i e^{\lambda_i t}\mathbf{s}_i

where λi\lambda_i, si\mathbf{s}_i are corresponding eigenvalues and eigenvectors of AA and each eigenvector satisfies

dsidt=λisi\frac{d\mathbf{s}_i}{dt} = \lambda_i\mathbf{s}_i

which is exactly our model problem.

In the case where

A=[0110]A= \begin{bmatrix} 0 & 1 \\ -1 & 0 \\ \end{bmatrix}

The eigenvalues are _ _ _ _ _ _ _ _ ?

Even with complex Eigenvalues and Eigenvectors, the solutions are real and can be shown to be perfect circles on the phase plane

x(t)=cos(t)v(t)=sin(t)\begin{align} x(t) &= \cos(t)\\ v(t) &=-\sin(t)\\ \end{align}

However, plotting the eigenvalues on the stability diagram show’s that Euler’s method will be completely unstable for this problem for all time steps Δt\Delta t

# Plot the region of absolute stability for Forward Euler
x = numpy.linspace(-2, 2, 11)
v = numpy.linspace(-2, 2, 11)
X, V = numpy.meshgrid(x, v)
t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)


fig = plt.figure(figsize=(16, 6))
axes = fig.add_subplot(1, 2, 1)
axes.quiver(X, V, V, -X)
axes.plot(1.5 * numpy.cos(t), -1.5 * numpy.sin(t))
axes.plot([-2, 2], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-2, 2], "k--")
axes.set_xlabel("x", fontsize=16)
axes.set_ylabel("v", fontsize=16)

axes.set_title("Phase Plane: simple Harmonic oscillator", fontsize=16)
axes.set_aspect("equal")

axes = fig.add_subplot(1, 2, 2)
axes.fill(
    numpy.cos(t) - 1.0,
    numpy.sin(t),
    color=(255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0),
)
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
axes.plot([0.0, 0.0], [-1.0, 1.0], "rx", markersize=10)
axes.set_xlim((-3, 3.0))
axes.set_ylim((-3, 3))
axes.set_aspect("equal")

axes.set_title("Absolute Stability Region for Forward Euler", fontsize=16)
plt.show()
f = lambda t, u: numpy.array([u[1], -u[0]])
t_span = [0.0, 4 * numpy.pi]
u_0 = numpy.array([1.0, 0.0])
N = 100
t, u = euler(f, t_span, u_0, N)
t, u = RK4(f, t_span, u_0, N)
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)
axes.quiver(X, V, V, -X)
axes.plot(1.0 * numpy.cos(t), -numpy.sin(t))
axes.plot(u[0], u[1], "r")
axes.plot([-2, 2], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-2, 2], "k--")
axes.set_xlabel("x", fontsize=16)
axes.set_ylabel("v", fontsize=16)

axes.set_title("Euler's method: simple Harmonic oscillator", fontsize=16)
axes.set_aspect("equal")

Absolute Stability of Single-Step Multi-Stage schemes -- The R method

This approach can be applied to all Single-Step Multi-Stage schemes including all the Runge-Kutta and Taylor Series methods. The recipe is straightforward

  • Apply the stepping scheme for one step of the model problem u=λuu' = \lambda u, u(0)=u0u(0)=u_0 assuming complex λ\lambda

  • This will result in a discrete approximation

U(Δt)=R(z)U0U(\Delta t) = R(z) U_0

where z=λΔtCz=\lambda\Delta t\in\mathbb{C}, which will be a discrete approximation to the true solution

U(Δt)=ezU0U(\Delta t) = e^{z} U_0
  • Absolute stability will require that R(z)<1|R(z)| < 1

Absolute Stability of the backward Euler Method

Now try this on backward Euler.

Un+1=Un+Δtf(tn+1,Un+1).U_{n+1} = U_n + \Delta t f(t_{n+1}, U_{n+1}).

which for our model problem is

Un+1=Un+ΔtλUn+1U_{n+1} = U_n + \Delta t\lambda U_{n+1}

or

(1Δtλ)Un+1=Un(1 - \Delta t\lambda) U_{n+1} = U_n

or rearranging and solving for Un+1U_{n+1} gives

Un+1=11ΔtλUn=11zUn\begin{align} U_{n+1} &= \frac{1}{1-\Delta t\lambda} U_n\\ &= \frac{1}{1-z}U_n\\ \end{align}

so

R(z)=11zR(z) = \frac{1}{1-z}

and absolute stability requires

R(z)11z1|R(z)| \leq 1 \leftrightarrow |1 - z| \geq 1

so in fact the stability region encompasses the entire complex plane except for a circle centered at (1,0)(1, 0) of radius 1 implying that the backward Euler method is in fact stable for any choice of Δt\Delta t for λ<0\lambda<0 (and even some positive values of λ\lambda.

# Plot the region of absolute stability for Backward Euler
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)

axes.set_facecolor((255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0))
axes.fill(numpy.cos(t) + 1.0, numpy.sin(t), "w")
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
# axes.set_xlim((-3, 1))
axes.set_ylim((-2, 2))
axes.set_aspect("equal")

axes.set_title("Absolute Stability Region for Backward Euler")
plt.show()

However, stability doesn’t imply accuracy either. In this case, backwards Euler will still fail after long times for the simple harmonic oscillator as well as it will damp towards zero. For the general Linear autonomous dynamical system

dudt=Au,u(0)=u0\frac{d\mathbf{u}}{dt} = A\mathbf{u}, \quad \mathbf{u}(0) = \mathbf{u}_0

Backwards Euler method can be written as

Un+1=Un+ΔtAUn+1\mathbf{U}_{n+1} = \mathbf{U}_n + \Delta t A\mathbf{U}_{n+1}

or rearranging, requires solving the linear algebraic problem at each step

(IΔtA)Un+1=Un(I - \Delta t A)\mathbf{U}_{n+1} = \mathbf{U}_n
# Implement Backwards Euler for a linear system $u' = Au$
def beuler(A, t_span, u0, N):
    """simple implementation of constant step-size forward euler method for vector valued f
    This doc string should have so much more in it
    """
    t = numpy.linspace(t_span[0], t_span[1], N)
    delta_t = t[1] - t[0]

    u = numpy.empty((len(t), len(u0)))
    B = numpy.eye(len(u0)) - delta_t * A
    u[0] = u0
    for n, t_n in enumerate(t[:-1]):
        u[n + 1] = numpy.linalg.solve(B, u[n])
    return t, u.transpose()
A = numpy.array([[0, 1], [-1, 0]])
t_span = [0.0, 4 * numpy.pi]
u_0 = numpy.array([1.0, 0.0])
N = 100
t, u = beuler(A, t_span, u_0, N)
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)
axes.quiver(X, V, V, -X)
axes.plot(1.0 * numpy.cos(t), -numpy.sin(t))
axes.plot(u[0], u[1], "r")
axes.plot([-2, 2], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-2, 2], "k--")
axes.set_xlabel("x", fontsize=16)
axes.set_ylabel("v", fontsize=16)

axes.set_title("Backward Euler: simple Harmonic oscillator", fontsize=16)
axes.set_aspect("equal")

General Stability Regions for Linear Methods

We can also map out stability regions for Linear Multi-step methods (although the analysis is slightly more complicated).

All of the Linear Multi-step methods from the previous lecture can be written

j=0rαjUn+j=Δtj=0rβjf(t,Un+j)\sum^r_{j=0} \alpha_j U_{n+j} = \Delta t \sum^r_{j=0} \beta_j f(t,U_{n+j})

which for our model problem, f(t,u)=λuf(t,u) = \lambda u becomes

j=0rαjUn+j=Δtj=0rβjλUn+j\sum^r_{j=0} \alpha_j U_{n+j} = \Delta t \sum^r_{j=0} \beta_j \lambda U_{n+j}

or rearranging and again defining z=Δtλz=\Delta t\lambda becomes

j=0r(αjβjz)Un+j=0.\sum^r_{j=0} (\alpha_j - \beta_j z) U_{n+j} = 0.

which takes the form of a “Linear Difference Equation”

Characteristic Polynomials and Linear Difference Equations

To help us analyze stability regions we need to take a small aside and consider linear difference equations and their solutions. Say we wanted to solve

j=0rαjUn+j=0\sum^r_{j=0} \alpha_j U_{n+j} = 0

given initial conditions U0,U1,,Ur1U_0, U_1, \ldots, U_{r-1}. This expression has a solution in the general form Un=ξnU_n = \xi^n.

Plugging this into the equation we have

j=0rαjξn+j=0\sum^r_{j=0} \alpha_j \xi^{n+j} = 0

which simplifies to

j=0rαjξj=0\sum^r_{j=0} \alpha_j \xi^j = 0

by dividing by ξn\xi^n. If ξ\xi then is a root of the polynomial

ρ(ξ)j=0rαjξj\rho(\xi) \equiv \sum^r_{j=0} \alpha_j \xi^j

then ξ\xi solves the original difference equation.

Example: the Fibonacci Numbers

A classic example of linear difference equation is the Fibonnacci sequence, 0,1,1,2,3,5,0,1,1,2,3,5,\ldots.

Which are solutions to the two-term difference equation

Un+2=Un+1+Un,U0=0,U1=1U_{n+2} = U_{n+1} + U_n, \quad U_0=0,\, U_1 = 1

or

Un+2Un+1Un=0U_{n+2} - U_{n+1} - U_n = 0

or α=[111]\alpha = \begin{bmatrix} -1 & -1 & 1 \\\end{bmatrix}

Substituting Un=ξnU_n = \xi^n into

Un+2Un+1Un=0U_{n+2} - U_{n+1} - U_n =0

yields

ξn(ξ2ξ1)=0\xi^n\left(\xi^2 - \xi -1\right) = 0

which has solutions ξ=1±52\xi = \frac{1\pm\sqrt{5}}{2}. Where the positive root is our favorite number ϕ=1.618033\phi=1.618033 which is the golden ratio. The negative root is 1ϕ1-\phi. Note, these roots are related to the eigenvalues of a related problem

Thus all Fibonacci numbers can be written as a linear combination of the two roots

Un=Aϕn+B(1ϕ)nU_n = A\phi^n + B(1-\phi)^n

where AA and BB are determined by the first two terms in the sequence [0,1].

Note, in this case ϕ>1\phi > 1 (while (1ϕ)<1|(1-\phi)|<1) thus UnU_n\rightarrow\infty as nn\rightarrow\infty.

Stability for LMM’s

Anyway, Back to the Linear difference equations arising from applying any Linear multistep method to the model problem

j=0r(αjβjz)Un+j=0.\sum^r_{j=0} (\alpha_j - \beta_j z) U_{n+j} = 0.

Letting

ρ(ξ)=j=0rαjξj,σ(ξ)=j=0rβjξj\rho(\xi) = \sum^r_{j=0} \alpha_j \xi^j, \quad \sigma(\xi) = \sum^r_{j=0} \beta_j \xi^j

and substituting Un=ξnU_n = \xi^n into the difference equations reduces to

π(ξ,z)=ρ(ξ)zσ(ξ)=0\pi(\xi, z) = \rho(\xi) - z \sigma(\xi) = 0

where π(ξ,z)\pi(\xi, z) is the stability polynomial of the method, whose roots are solutions of the difference method.

Absolute stability of the linear multi-step methods

For UnU_n to stay bounded as nn\rightarrow\infty, it requires that all roots ξi\xi_i of π(ξ,z)\pi(\xi, z) satisfy

ξi1|\xi_i| \leq 1

If this is the case, then the multi-step method is zero-stable. Roughly you can think of these roots as multiplying the past truncation errors so that if they are less than or equal to 1 they do not grow in magnitude. We then define the region of absolute stability as the values for zz for which this is true. This approach reduces to the RR method for one-step methods as a special case.

Example: Forward Euler’s Method

Examining forward Euler’s method we have

0=Un+1UnΔtλUn=Un+1Un(1+Δtλ)\begin{aligned} 0 &= U_{n+1} - U_n - \Delta t \lambda U_n \\ &= U_{n+1} - U_n (1 + \Delta t \lambda)\\ \end{aligned}

setting Un=ξnU_n = \xi^n implies

0=ξn(ξ(1+z))\begin{aligned} 0 &= \xi^n\left(\xi - (1 + z)\right)\\ \end{aligned}

or π(ξ,z)=ξ(1+z)\pi(\xi, z) = \xi - (1 + z) whose root is ξ=1+z\xi = 1 + z and we have re-derived the stability region we had found before.

Example: Backwards Euler’s Method

Similarly for backwards Euler

0=Un+1UnΔtλUn+1=Un+1(1Δtλ)Un=ξ(1z)1=π(ξ,z)\begin{aligned} 0 &= U_{n+1} - U_n - \Delta t \lambda U_{n+1} \\ &= U_{n+1}(1 - \Delta t\lambda) - U_n \\ &= \xi(1 - z) - 1 \\ &=\pi(\xi, z) \end{aligned}

whose root is ξ=1/(1z)\xi = 1/(1 - z)

Example: Adams-Bashforth 2

As an example consider the Adams-Bashforth 2-stage LMM

Un+2=Un+1+Δt2(f(Un)+3f(Un+1))U_{n+2} = U_{n+1} + \frac{\Delta t}{2} (-f(U_n) + 3 f(U_{n+1}))

Let f(U)=λUf(U) = \lambda U, z=Δtλz = \Delta t \lambda

Un+2Un+112(zUn+3zUn+1)=0U_{n+2} - U_{n+1} - \frac{1}{2} (-z U_n + 3 z U_{n+1}) = 0 \\

substituting Un=ξnU_n=\xi^n, and factoring out ξn\xi^n yields

π(ξ,z)=ξ2ξ12(z+3zξ)\pi(\xi, z) = \xi^{2} - \xi - \frac{1}{2} (-z + 3 z \xi)

with roots when

2ξ2(2+3z)ξ+z=02 \xi^2 - (2 + 3z) \xi + z =0

Plotting Stability Regions

Given a stability polynomial π(ξ,z)\pi(\xi,z), absolute stability requires that

maxξi<=1\max{|\xi_i|} <= 1

where ξi\xi_i are the (generally complex) roots of π(ξ,z)\pi(\xi, z) and therefore a function of zz. In the case of single step schemes, the problem reduces to finding the region of the complex plane where R(z)<1|R(z)| < 1.

Either way, the simplest way to visualize the regions of the complex plane where the scheme is stable, is to calculate either R(z)|R(z)| or maxξi(z)\max|\xi_i(z)| and plot the 1-contour and determine all regions that are <1<1.

Some useful codes

study the following cells for some useful routines for plotting stability regions for both Single-step multi-stage schemes and Linear Multi-step schemes

def stability_plot(X, Y, C, axes, title=None, continuous=True):
    """
    Utility function to make stability diagram given complex stability scalar C

    parameters:
    -----------

    X, Y:  numpy.meshgrids for complex plane
    C: numpy array
        Field to plot,  either |R(z)| for a single step scheme, or max(|xi_i(z)|) for a LMM scheme
    axes: matplotlib axes object
        subplot or plot to draw in.
    title: string
        subplot title if not None
    continuous: boolean
        if True, plot a continous coloring of C
        if False, plot Heaviside(C)
    """
    if continuous:
        Ch = C
    else:
        Ch = numpy.heaviside(C - 1, 0.0)
    pcolor_plot = axes.pcolor(
        X, Y, Ch, vmin=0, vmax=1, cmap=plt.get_cmap("Greens_r"), shading="auto"
    )
    axes.contour(X, Y, C, "k", levels=[1.0])
    fig = plt.gcf()
    fig.colorbar(pcolor_plot)
    axes.plot(x, numpy.zeros(x.shape), "k--")
    axes.plot(numpy.zeros(y.shape), y, "k--")

    axes.set_xlabel("Re", fontsize=16)
    axes.set_ylabel("Im", fontsize=16)
    if title is not None:
        axes.set_title(title, fontsize=16)

    axes.set_aspect("equal")


def plot_stability_ssms(R, x, y, axes=None, title=None, continuous=True):
    """
    plot stability regions for single-step multi-stage ODE schemes given the function R(z)
    such that U_{n+1} = R(z)U_n   and z is complex

    parameters:
    -----------

    R: calleable
        function of a complex variable z such that if |R|<=1, the scheme is absolutely stable
    x: numpy array
        array of values for the real axis
    y: numpy array
        values to plot for the imaginary axis
    axes: matplotlib axes object
        subplot or plot to draw in.  If axes=None create a new figure
    title: string
        subplot title if
    continuous: boolean
        if True, plot a continous coloring of C
        if False, plot Heaviside(C)
    """

    X, Y = numpy.meshgrid(x, y)
    Z = X + 1j * Y
    if axes is None:
        fig = plt.figure(figsize=(8, 6))
        axes = fig.add_subplot(1, 1, 1)

    abs_R = numpy.abs(R(Z))
    stability_plot(X, Y, abs_R, axes, title, continuous)


def plot_stability_lmm(pi_coeff, x, y, axes=None, title=None, continuous=True):
    """
    plot stability regions for linear multi-step  ODE schemes given the coefficients of the stability polynomial
    pi(xi, z)

    parameters:
    -----------

    pi_coeff: calleable (function of z)
        function that returns array of stability polynomial pi(z)
    x: numpy array
        array of values for the real axis
    y: numpy array
        values to plot for the imaginary axis
    axes: matplotlib axes object
        subplot or plot to draw in.  If axes=None create a new figure
    title: string
        subplot title if not None
    continuous: boolean
        if True, plot a continous coloring of C
        if False, plot Heaviside(C)
    """

    X, Y = numpy.meshgrid(x, y)
    Z = X + 1j * Y
    if axes is None:
        fig = plt.figure(figsize=(8, 6))
        axes = fig.add_subplot(1, 1, 1)

    norm_max = numpy.empty(Z.shape)
    for i, row in enumerate(Z):
        for j, z in enumerate(row):
            norm_max[i, j] = max(numpy.abs(numpy.roots(pi_coeff(z))))

    stability_plot(X, Y, norm_max, axes, title, continuous)
x = numpy.linspace(-3, 3, 100)
y = numpy.linspace(-3, 3, 100)

R_euler = lambda z: 1 + z
R_beuler = lambda z: 1.0 / (1.0 - z)
R_trap = lambda z: (1.0 + 0.5 * z) / (1.0 - 0.5 * z)
R_RK2 = lambda z: 1 + z + z**2 / 2
R_Taylor4 = lambda z: 1 + z + z**2 / 2 + z**3 / 6.0 + z**4 / 24.0

continuous = True
fig = plt.figure(figsize=(12, 12))
axes = fig.add_subplot(2, 2, 1)
plot_stability_ssms(
    R_euler, x, y, axes=axes, title="Stability of Eulers method", continuous=continuous
)

axes = fig.add_subplot(2, 2, 2)
plot_stability_ssms(
    R_beuler,
    x,
    y,
    axes=axes,
    title="Stability of Backwards Eulers",
    continuous=continuous,
)

axes = fig.add_subplot(2, 2, 3)
plot_stability_ssms(
    R_trap,
    x,
    y,
    axes=axes,
    title="Stability of Trapezoidal method",
    continuous=continuous,
)

axes = fig.add_subplot(2, 2, 4)
plot_stability_ssms(
    R_Taylor4,
    x,
    y,
    axes=axes,
    title="Stability of Taylor4 method",
    continuous=continuous,
)
x = numpy.linspace(-3, 3, 100)
y = numpy.linspace(-3, 3, 100)

pi_euler = lambda z: numpy.array([1.0, -(1 + z)])
pi_AB2 = lambda z: numpy.array([2, -(2.0 + 3 * z), z])
fig = plt.figure(figsize=(12, 6))
axes = fig.add_subplot(1, 2, 1)
plot_stability_lmm(pi_euler, x, y, axes=axes, title="Stability of Eulers method")
axes = fig.add_subplot(1, 2, 2)
plot_stability_lmm(
    pi_AB2, x, y, axes=axes, title="Stability of Adams-Bashforth2 method"
)

Application to Stiff ODEs

Consider the ODE IVP

u(t)=λ(ucost)sint,u(t0)=u0u'(t) = \lambda (u - \cos t) - \sin t,\quad u(t_0)=u_0

The general solution of the ODE is

u(t)=eλ(tt0)(u0cost0)+costu(t) = e^{\lambda (t - t_0)} (u_0 - \cos t_0) + \cos t

For u0=1u_0=1 at t0=0t_0=0, the solution is simply cost\cos t

What happens to solutions that are slightly different from u0=1u_0 = 1 or t0=0t_0 = 0?

u(t)=eλ(tt0)(u0cost0)+costu(t) = e^{\lambda (t - t_0)} (u_0 - \cos t_0) + \cos t
# Plot "hairy" solutions to the ODE
u = lambda t_0, u0, lam, t: numpy.exp(lam * (t - t_0)) * (
    u0 - numpy.cos(t_0)
) + numpy.cos(t)

fig = plt.figure(figsize=(12, 10))

for i, lam in enumerate([-1, -10, 1, 10]):
    axes = fig.add_subplot(2, 2, i + 1)
    for u0 in numpy.linspace(-1, 1, 10):
        for t_0 in numpy.linspace(0.0, 9.0, 10):
            t = numpy.linspace(t_0, 10.0, 100)
            axes.plot(t, u(t_0, u0, lam, t), "b")
    t = numpy.linspace(0.0, 10.0, 100)
    axes.plot(t, numpy.cos(t), "r", linewidth=5)
    axes.set_ylim((-2, 2))

    axes.set_title("Perturbed Solutions $\lambda = %s$" % lam)
    axes.set_xlabel("$t$")
    axes.set_ylabel("$u(t)$")
axes.set_ylim((-2, 2))
plt.show()
# Plot "inverse hairy" solutions to the ODE
u = lambda t_0, u0, lam, t: numpy.exp(lam * (t - t_0)) * (
    u0 - numpy.cos(t_0)
) + numpy.cos(t)

num_steps = 10
error = numpy.ones(num_steps) * 1.0
t_hat = numpy.linspace(0.0, 10.0, num_steps + 1)
t_whole = numpy.linspace(0.0, 10.0, 1000)

fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
eta = 1.0
lam = 0.1

for n in range(1, num_steps):
    t = numpy.linspace(t_hat[n - 1], t_hat[n], 100)
    U = u(t_hat[n - 1], eta, lam, t)
    axes.plot(t, U, "b")
    axes.plot(t_whole, u(t_hat[n - 1], eta, lam, t_whole), "b--")
    axes.plot([t[-1], t[-1]], (U[-1], U[-1] + -(1.0**n) * error[n]), "r")
    eta = U[-1] + -(1.0**n) * error[n]

t = numpy.linspace(0.0, 10.0, 100)
axes.plot(t, numpy.cos(t), "g")

axes.set_title("Perturbed Solutions $\lambda = %s$" % lam)
axes.set_xlabel("$t$")
axes.set_ylabel("$u(t)$")
axes.set_ylim((-10, 10))
plt.show()

Example: Chemical systems

Consider the transition of a chemical AA to a chemical CC through the process

AK1BK2C.A \overset{K_1}{\rightarrow} B \overset{K_2}{\rightarrow} C.

We can write this problem as a system of Linear ODE’s $$

dAdt=K1AdBdt=K1AK2BdCdt=K2B\begin{align*} \frac{dA}{dt} &= -K_1 A\\ \frac{dB}{dt} &= K_1 A -K_2B\\ \frac{dC}{dt} &= K_2B\\ \end{align*}

$withinitialcondition with initial condition A_0, B_0, C_0$

If we let

u=[ABC]\mathbf{u} = \begin{bmatrix} A \\ B \\ C \end{bmatrix}

then we can rewrite this as a linear dynamical system

dudt=Mu,u(0)=u0\frac{d\mathbf{u}}{dt} = M\mathbf{u}, \quad \mathbf{u}(0) = \mathbf{u}_0

where

M=[K100K1K200K20]M = \begin{bmatrix} -K_1 & 0 & 0 \\ K_1 & -K_2 & 0 \\ 0 & K_2 & 0 \end{bmatrix}

If AA is diagonalizable, the general solution of a linear dynamical system can be written in terms of the eigenvalues (λi\lambda_i) and eigenvectors xi\mathbf{x}_i of MM as

u(t)=c1eλ1tx1+c2eλ2tx2+c3eλ3tx3\mathbf{u}(t) = c_{1} e^{\lambda_1 t}\mathbf{x}_1 + c_{2}e^{\lambda_2t}\mathbf{x}_2 + c_{3}e^{\lambda_3t}\mathbf{x}_3

And given our matrix $$ M =

[K100K1K200K20]\begin{bmatrix} -K_1 & 0 & 0 \\ K_1 & -K_2 & 0 \\ 0 & K_2 & 0 \end{bmatrix}

$$

The eigenvalues are ??

And the general solution is

u(t)=c1eK1tx1+c2eK2tx2+c3x3\mathbf{u}(t) = c_{1} e^{-K_1t}\mathbf{x}_1 + c_{2}e^{-K_2t}\mathbf{x}_2 + c_{3}\mathbf{x}_3

and the constants cic_i can be found by solving Xc=u0X\mathbf{c} = \mathbf{u}_0 which describe the initial condition as a linear combination of the eigenvectors.

# Solve the chemical systems example using a Forward Euler scheme
# Problem parameters
K_1 = 1
K_2 = 1000
z = -2.01
dt = z / (-K_2)
T_max = 5.0
N = int(T_max / dt)
print("dt = {}, N = {}".format(dt, N))


A = numpy.array([[-K_1, 0, 0], [K_1, -K_2, 0], [0, K_2, 0]])
f = lambda u: numpy.dot(A, u)
u_0 = [3, 0, 0.0]
t_span = [0, T_max]
f = lambda t, u: A.dot(u)
t, U = euler(f, t_span, u_0, N)

fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

axes.plot(t, U.transpose())
axes.legend(["[A]", "[B]", "[C]"])
axes.set_title("Chemical System")
axes.set_xlabel("$t$")
axes.set_title("$[A], [B], [C]$")
# axes.set_ylim((0.0, 10.))
# axes.set_xlim((0.0, 2.0))
axes.grid()

plt.show()

What is stiffness?

In general a stiff ODE is one where u(t)f(t,u)u'(t) \ll f'(t, u). For linear dynamical systems like the chemical decay problem, the stiffness ratio

maxpλpminpλp\frac{\max_p |\lambda_p|}{\min_p |\lambda_p|}

can be used to characterize the stiffness of the system. In our last example this ratio was K2/K1K_2 / K_1 if K2>K1K_2 > K_1. As we increased this ratio we observed that the numerical method became unstable and only a reduction in Δt\Delta t provided a stable solution again. For explicit time stepping methods this is problematic as the reduction of the time step for only one of the species leads to very expensive evaluations. For example, forward Euler has the stability criteria

1+Δtλ<1|1 + \Delta t \lambda| < 1

where λ\lambda will have to be the maximum eigenvalue of the system.

# Implement vectorized Forward Euler
def euler(f, t_span, u0, N):
    """simple implementation of constant step-size forward euler method for vector valued f
    This doc string should have so much more in it
    """
    t = numpy.linspace(t_span[0], t_span[1], N)
    if numpy.isscalar(u0):
        u = numpy.empty(t.shape)
    else:
        u = numpy.empty((len(t), len(u0)))
    u[0] = u0
    delta_t = t[1] - t[0]
    for n, t_n in enumerate(t[:-1]):
        K1 = delta_t * f(t_n, u[n])
        u[n + 1] = u[n] + K1
    return t, u.transpose()
# set the eigenvalues and time step
K_1 = 1.0
K_2 = 3.0
delta_t = 1.0
eigenvalues = [-K_1, -K_2]
# Plot the region of absolute stability for Forward Euler
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)


axes.fill(
    numpy.cos(t) - 1.0,
    numpy.sin(t),
    color=(255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0),
)
for lam in eigenvalues:
    print("z = {}".format(lam * delta_t))
    axes.plot(lam * delta_t, 0.0, "ko")
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
# axes.set_xlim((-3, 1))
axes.set_ylim((-2, 2))
axes.set_aspect("equal")

axes.set_title("Absolute Stability Region for Forward Euler")
plt.show()
# solve using backwards Euler
K_1 = 1.0
K_2 = 300
A = numpy.array([[-K_1, 0, 0], [K_1, -K_2, 0], [0, K_2, 0]])

N = 50
t_span = [0.0, 5.0]
U0 = [3.0, 0.0, 0.0]

t, U = beuler(A, t_span, U0, N)
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

axes.plot(t, U.T)
axes.legend(["[A]", "[B]", "[C]"])
axes.set_title("Chemical System")
axes.set_xlabel("$t$")
axes.set_title("$[A], [B], [C]$")
axes.grid()

plt.show()
from scipy.integrate import solve_ivp

t_span = [0.0, 5.0]
U0 = [3.0, 0.0, 0.0]
K_1 = 1.0
K_2 = 10000
A = numpy.array([[-K_1, 0, 0], [K_1, -K_2, 0], [0, K_2, 0]])
f = lambda t, u: numpy.dot(A, u)

sol = solve_ivp(f, t_span, U0, method="Radau")
fig = plt.figure(figsize=(8, 6))
axes = fig.add_subplot(1, 1, 1)

axes.plot(sol.t, sol.y.T, "o-")
axes.legend(["[A]", "[B]", "[C]"])
axes.set_title("Chemical System")
axes.set_xlabel("$t$")
axes.set_title("$[A], [B], [C]: N={}$".format(len(sol.t)))
axes.grid()

plt.show()

A-Stability

What if we could expand the absolute stability region to encompass more of the left-half plane or even better, all of it. A method that has this property is called A-stable. We have already seen one example of this with backward Euler which as a stability region of

1z1|1 - z| \geq 1

which covers the full left-half plane. It turns out that for linear multi-step methods a theorem by Dahlquist proves that there are no LMMs that satisfies the A-stability criterion beyond second order (trapezoidal rule). There are higher-order Runge-Kutta methods do however.

Perhaps this is too restrictive though. Often large eigenvalues for systems (for instance coming from a PDE discretization for the heat equation) lie completely on the real line. If the stability region can encompass as much of the real line as possible while leaving out the rest of the left-half plane we can possibly get a more efficient method. There are a number of methods that can be constructed that have this property but are higher-order.

# Plot the region of absolute stability for Backward Euler
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)

t = numpy.linspace(0.0, 2.0 * numpy.pi, 100)

K_1 = 3.0
K_2 = 1.0
delta_t = 1.0
eigenvalues = [-K_1, -K_2]

axes.set_facecolor((255.0 / 255.0, 145.0 / 255.0, 0 / 255.0, 1.0))
axes.fill(numpy.cos(t) + 1.0, numpy.sin(t), "w")
for lam in eigenvalues:
    print(lam * delta_t)
    axes.plot(lam * delta_t, 0.0, "ko")
axes.plot([-3, 3], [0.0, 0.0], "k--")
axes.plot([0.0, 0.0], [-3, 3], "k--")
# axes.set_xlim((-3, 1))
axes.set_ylim((-2, 2))
axes.set_aspect("equal")

axes.set_title("Absolute Stability Region for Backward Euler")
plt.show()

L-Stability

It turns out not all A-stable methods are alike. Consider the backward Euler method and the trapezoidal method defined by

Un+1UnΔt=12(f(Un)+f(Un+1))\frac{U_{n+1} - U_n}{\Delta t} = \frac{1}{2}(f(U_n) + f(U_{n+1}))

whose stability polynomial (or R(z))R(z)) is

0=Un+1UnΔt12(λUn+λUn+1)=Un+1(112Δtλ)Un(1+12Δtλ)\begin{aligned} 0 &= U_{n+1} - U_n - \Delta t \frac{1}{2} (\lambda U_n + \lambda U_{n+1}) \\ &= U_{n+1}\left(1 - \frac{1}{2} \Delta t \lambda \right ) - U_n \left(1 + \frac{1}{2}\Delta t \lambda \right)\\ \end{aligned}

or

π(ξ,z)=(1z2)ξ(1+z2)\pi(\xi,z) = \left(1 - \frac{z}{2}\right ) \xi - \left(1 + \frac{z}{2}\right)

with single root

ξ(z)=R(z)=1+z21z2\xi(z) = R(z) = \frac{1 + \frac{z}{2}}{1 - \frac{z}{2}}

which shows that it is A-stable (i.e. R(z)<1|R(z)| < 1 for all (z)<0\Re(z)<0)

Example

Let’s apply both these methods the stiff problem

u(t)=λ(ucost)sint,u(t0)=u0u'(t) = \lambda (u - \cos t) - \sin t,\quad u(t_0)=u_0

With solution

u(t)=eλ(tt0)(u0cost0)+costu(t) = e^{\lambda (t - t_0)} (u_0 - \cos t_0) + \cos t

and see what happens.

# Plot "hairy" solutions to the ODE
u = lambda t_0, u0, lam, t: numpy.exp(lam * (t - t_0)) * (
    u0 - numpy.cos(t_0)
) + numpy.cos(t)

fig = plt.figure(figsize=(12, 5))

for i, lam in enumerate([-2, -20]):
    axes = fig.add_subplot(1, 2, i + 1)
    for u0 in numpy.linspace(-1, 1, 10):
        for t_0 in numpy.linspace(0.0, 9.0, 10):
            t = numpy.linspace(t_0, 10.0, 100)
            axes.plot(t, u(t_0, u0, lam, t), "b")
    t = numpy.linspace(0.0, 10.0, 100)
    axes.plot(t, numpy.cos(t), "r", linewidth=5)
    axes.set_ylim((-1.5, 1.5))
    axes.grid()

    axes.set_title("Perturbed Solutions $\lambda = %s$" % lam)
    axes.set_xlabel("$t$")
    axes.set_ylabel("$u(t)$")
axes.set_ylim((-1.5, 1.5))
plt.show()
# Compare accuracy between Euler and Trapezoidal for the stiff ODE
f = lambda t, lam, u: lam * (u - numpy.cos(t)) - numpy.sin(t)
u_exact = lambda t_0, eta, lam, t: numpy.exp(lam * (t - t_0)) * (
    u0 - numpy.cos(t_0)
) + numpy.cos(t)

t_0 = 0.0
t_f = 1.0
eta = 2.0
lam = -1e3
num_steps = [10, 20, 40, 50]
# num_steps = numpy.arange(100, 1000, 100)

delta_t = numpy.empty(len(num_steps))
error_euler = numpy.empty(len(num_steps))
error_trap = numpy.empty(len(num_steps))

for i, N in enumerate(num_steps):
    t = numpy.linspace(0, t_f, N)
    delta_t[i] = t[1] - t[0]
    u = u_exact(t_0, eta, lam, t_f)

    # Compute Euler solution
    U_euler = numpy.empty(t.shape)
    U_euler[0] = eta
    for n, t_n in enumerate(t[1:]):
        U_euler[n + 1] = (
            U_euler[n] - lam * delta_t[i] * numpy.cos(t_n) - delta_t[i] * numpy.sin(t_n)
        ) / (1.0 - lam * delta_t[i])
    error_euler[i] = numpy.abs(U_euler[-1] - u) / numpy.abs(u)

    # Compute using trapezoidal
    U_trap = numpy.empty(t.shape)
    U_trap[0] = eta
    for n, t_n in enumerate(t[1:]):
        U_trap[n + 1] = (
            U_trap[n]
            + delta_t[i] * 0.5 * f(t_n, lam, U_trap[n])
            - 0.5 * lam * delta_t[i] * numpy.cos(t_n)
            - 0.5 * delta_t[i] * numpy.sin(t_n)
        ) / (1.0 - 0.5 * lam * delta_t[i])
    error_trap[i] = numpy.abs(U_trap[-1] - u)
# Plot error vs. delta_t
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
axes = fig.add_subplot(1, 2, 1)
axes.plot(t, U_euler, "ro-")
axes.plot(t, u_exact(t_0, eta, lam, t), "k")

axes = fig.add_subplot(1, 2, 2)
axes.loglog(delta_t, error_euler, "bo")
order_C = lambda delta_x, error, order: numpy.exp(
    numpy.log(error) - order * numpy.log(delta_x)
)
axes.loglog(
    delta_t,
    order_C(delta_t[1], error_euler[1], 1.0) * delta_t**1.0,
    "r--",
    label="1st Order",
)
axes.loglog(
    delta_t,
    order_C(delta_t[1], error_euler[1], 2.0) * delta_t**2.0,
    "b--",
    label="2nd Order",
)

axes.legend(loc=4)
axes.set_title("Comparison of Errors for Backwards Euler")
axes.set_xlabel("$\Delta t$")
axes.set_ylabel("$|U(t_f) - u(t_f)|$")

# Plots for trapezoid
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
axes = fig.add_subplot(1, 2, 1)
axes.plot(t, U_trap, "ro-")
axes.plot(t, u_exact(t_0, eta, lam, t), "k")

axes = fig.add_subplot(1, 2, 2)
axes.loglog(delta_t, error_trap, "bo", label="Trapezoidal")
order_C = lambda delta_x, error, order: numpy.exp(
    numpy.log(error) - order * numpy.log(delta_x)
)
axes.loglog(
    delta_t,
    order_C(delta_t[1], error_trap[1], 1.0) * delta_t**1.0,
    "r--",
    label="1st Order",
)
axes.loglog(
    delta_t,
    order_C(delta_t[1], error_trap[1], 2.0) * delta_t**2.0,
    "b--",
    label="2nd Order",
)

axes.legend(loc=4)
axes.set_title("Comparison of Errors for Trapezoidal Rule")
axes.set_xlabel("$\Delta t$")
axes.set_ylabel("$|U(t_f) - u(t_f)|$")

plt.show()

It turns out that if we look at a one-step method and define the following ratio

Un+1=R(z)UnU_{n+1} = R(z) U_n

we can define another form of stability, called L-stable, where we require that the method is A-stable and that

limzR(z)=0.\lim_{z \rightarrow \infty} |R(z)| = 0.

Turns out that backwards Euler is L-stable while trapezoidal rule is not.

R_beuler = lambda z: 1.0 / (1.0 - z)
R_trap = lambda z: (1.0 + 0.5 * z) / (1.0 - 0.5 * z)
fig = plt.figure(figsize=(12, 6))
axes = fig.add_subplot(1, 2, 1)
plot_stability_ssms(R_beuler, x, y, axes=axes, title="Stability of Backwards Eulers")

axes = fig.add_subplot(1, 2, 2)
plot_stability_ssms(R_trap, x, y, axes=axes, title="Stability of Trapezoidal")

Backward Differencing Formulas

A class of LMM methods that are useful for stiff ODE problems are the backward difference formula (BDF) methods which have the form

α0Un+α1Un+1++αrUn+r=Δtβrf(Un+r)\alpha_0 U_n + \alpha_1 U_{n+1} + \cdots + \alpha_r U_{n+r} = \Delta t\beta_r f(U_{n+r})

These methods can be derived directly from backwards finite differences from the point Un+rU_{n+r} and the rest of the points back in time. One can then derive r-step methods that are rth-order accurate this way. Some of the methods are

BDF-1 (r=1):Un+1Un=Δtf(Un+1)BDF-2 (r=2):3Un+24Un+1+Un=2Δtf(Un+2)BDF-3 (r=3):11Un+318Un+2+9Un+12Un=6Δtf(Un+3)BDF-4 (r=4):25Un+448Un+3+36Un+216Un+1+3Un=12Δtf(Un+4)\begin{aligned} \text{BDF-1}~& (r = 1):& & U_{n+1} - U_n = \Delta t f(U_{n+1}) \\ \text{BDF-2}~& (r = 2):& &3 U_{n+2} - 4 U_{n+1} + U_n = 2 \Delta t f(U_{n+2}) \\ \text{BDF-3}~& (r = 3):& &11U_{n+3} - 18U_{n+2} + 9U_{n+1} - 2 U_n = 6 \Delta t f(U_{n+3}) \\ \text{BDF-4}~& (r = 4):& &25 U_{n+4} - 48 U_{n+3} +36 U_{n+2} -16 U_{n+1} +3 U_n = 12 \Delta t f(U_{n+4}) \end{aligned}
  • Inspection of BDF-1 shows that it’s equivalent to ??

  • You should also recognize BDF-2 from our Numerical differentiation notes\ldots

from fdcoeffV import fdcoeffV

# order of the BDF scheme (i.e. N=2 implies BDF-2)
N = 3
# order of the derivative
k = 1
x = numpy.array(range(N + 1))
w = fdcoeffV(k, x[N], x)
print("weights = {}".format(w))
print("weights x 6 = {}".format(6 * w))
## Finish the code off here for calculating BDF coefficients
pi_BDF1 = lambda z: numpy.array([1.0 - z, -1.0])
pi_BDF2 = lambda z: numpy.array([3.0 - 2 * z, -4.0, 1.0])
pi_BDF3 = lambda z: numpy.array([11 - 6.0 * z, -18.0, 9.0, -2.0])
pi_BDF4 = lambda z: numpy.array([25.0 - 12 * z, -48.0, 36, -16.0, 3])
x = numpy.linspace(-10, 10, 50)
y = numpy.linspace(-10, 10, 50)
fig = plt.figure(figsize=(12, 12))
axes = fig.add_subplot(2, 2, 1)
plot_stability_lmm(pi_BDF1, x, y, axes=axes, title="Stability of BDF1 method")

axes = fig.add_subplot(2, 2, 2)
plot_stability_lmm(pi_BDF2, x, y, axes=axes, title="Stability of BDF2 method")

axes = fig.add_subplot(2, 2, 3)
plot_stability_lmm(pi_BDF3, x, y, axes=axes, title="Stability of BDF3 method")

axes = fig.add_subplot(2, 2, 4)
plot_stability_lmm(pi_BDF4, x, y, axes=axes, title="Stability of BDF4 method")

BDF Schemes

Pro’s

  • LMM’s so only require 1 (non-linear) solve per time step

  • relatively easy to derive and code methods of variable order

  • L-Stable and A-Stable

Con’s

  • Still need to deal solving for system’s of non-linear equations

  • LMM’s are harder to make adaptive (but can do so using interpolation techniques)

Example, BDF-1 for a general non-linear ODE

Consider the general nn-dimensional nonlinear system of ODE’s (e.g van-der pol oscillator or SIR problem) which can be written as

dudt=f(t,u),u(0)=u0\frac{d\mathbf{u}}{dt} = \mathbf{f}(t,\mathbf{u}), \quad \mathbf{u}(0) = \mathbf{u}_0

where

u(t)=[u1(t)u2(t)un(t)],f(t,u)=[f1(t,u)f2(t,u)fn(t,u)]\mathbf{u}(t) = \begin{bmatrix} u_1(t) \\ u_2(t) \\ \vdots \\ u_n(t) \end{bmatrix},\quad \mathbf{f}(t,\mathbf{u}) = \begin{bmatrix} f_1(t,\mathbf{u}) \\ f_2(t,\mathbf{u}) \\ \vdots \\ f_n(t,\mathbf{u}) \end{bmatrix}

and f\mathbf{f} is a vector of scalar, possibly non-linear functions of the solution vector u\mathbf{u}

for example, in the Van-der-pol oscillator n=2n=2 and

f(t,u)=[u2μ(1u12)u2u1]\mathbf{f}(t,\mathbf{u}) = \begin{bmatrix} u_2 \\ \mu(1 - u_1^2)u_2 - u_1 \end{bmatrix}

BDF-1 discretization (Backwards-Euler)

So the BDF-1 (Backwards Euler) discretization of this system can be written

Un+1UnΔtf(t,Un+1)=0\mathbf{U}_{n+1} - \mathbf{U}_{n} - \Delta t\mathbf{f}(t,\mathbf{U}_{n+1}) = \mathbf{0}

Which itself is a more general system of non-linear Equations for Un+1\mathbf{U}_{n+1} of form

F(Un+1)=[F1(t,Un+1,Un)F2(t,Un+1,Un)Fn(t,Un+1,Un)]=0\mathbf{F}(\mathbf{U}_{n+1}) = \begin{bmatrix} F_1(t,\mathbf{U}_{n+1},\mathbf{U}_n) \\ F_2(t,\mathbf{U}_{n+1},\mathbf{U}_n) \\ \vdots\\ F_n(t,\mathbf{U}_{n+1},\mathbf{U}_n)\\ \end{bmatrix} = \mathbf{0}

where, in this case

Fi=Ui(n+1)Ui(n)Δtfi(t,Un+1)F_i = {U_{i}}_{(n+1)} - {U_{i}}_{(n)} - \Delta tf_i(t,\mathbf{U}_{n+1})

Solution of systems of Non-linear equations: Newton’s method

If our system had only one scalar variable and function f(t,u)f(t,u), then each step would reduce to solving the scalar non-linear problem

F(un+1)=0F(u_{n+1}) = 0

which is a rootfinding problem, and we could use any of our rootfinders to solve this (in particular we could use robust bracketing schemes like brent). Unfortunately once we move to higher-dimensions the problems become significantly harder (and solutions may not exist). However we can still attempt to apply Newton’s method to the general non-linear problem

F(x)=0\mathbf{F}(\mathbf{x}) = \mathbf{0}

Newton’s method: preliminaries

As a simpler non-linear system, we will consider the system of equations

x12+2x22=γ1x1+5x2=γ2\begin{align} x_1^2 + 2x_2^2 &=\gamma_1 \\ x_1 + 5x_2 &=\gamma_2 \\ \end{align}

which geometrically is the intersection of an ellipse and a straight line, which, depending on the values of γ1\gamma_1 and γ2\gamma_2 should have 0,1, or 2 solutions

x = numpy.linspace(-4, 4, 100)
y = numpy.linspace(-4, 4, 100)
X, Y = numpy.meshgrid(x, y)

Z1 = lambda gamma: X**2 + 2 * Y**2 - gamma
Z2 = lambda gamma: X + 5 * Y - gamma

gamma_1 = 10 * numpy.ones(3)
gamma_2 = [-2, 10, -15]
fig = plt.figure(figsize=(24, 6))

for i in range(3):
    axes = fig.add_subplot(1, 3, i + 1)
    axes.contour(X, Y, Z1(gamma_1[i]), [0], colors="r")
    axes.contour(X, Y, Z2(gamma_2[i]), [0], colors="b")
    axes.axis("equal")

    axes.set_xlabel("$x_1$")
    axes.set_ylabel("$x_2$")
    axes.set_title("$\gamma_1={}, \gamma_2={}$".format(gamma_1[i], gamma_2[i]))
    axes.grid()

plt.show()

Now this system is sufficiently simple that we could solve it using substitution and solve a quadratic equation for the roots (or even use a 1-D rootfinder). But in general we would like a more general method where we can start from some initial guess and rapidly converge to one of the roots. Newton’s method provides a general way to solve these sorts of problems (with all the issues that go with Newton for scalar problems)

Newton’s method: Review f(x)=0f(x)=0

Newton’s method for a scalar function of a single variable says, given any xx such that f(x)0f(x)\neq 0, there should be a correction δ\delta such that

f(x+δ)=0f(x+\delta) = 0

Or expanding in a Taylor’s series around xx

f(x+δ)=f(x)+f(x)δ+O(fδ2)=0f(x+\delta) = f(x) + f'(x)\delta + O(f''\delta^2) = 0

Neglecting the higher order terms this becomes

f(x)+f(x)δ=0f(x) + f'(x)\delta = 0

or solving for the correction

δ=f(x)f(x)\delta =\frac{-f(x)}{f'(x)}

If the correction were exact it would follow that f(x+δ)=0f(x+\delta)=0 and we would have our solution.

In general, the higher-order terms are important so the residual is not zero, but this gives us an iterative scheme that we hope will converge

Set an initial guess x0x_0. Then iterate for k=0,1,2k=0,1,2\ldots

Solve:δk=f(xk)f(xk)xk+1=xk+δk\begin{align} \text{Solve:} & & \delta_k &= \frac{-f(x_k)}{f'(x_k)}\\ & & x_{k+1} &= x_k + \delta_k \end{align}

until f(xk)<tol|f(x_k)| < \text{tol}

If ff is well behaved with simple roots and the initial guess is within the basin of attraction of a given root, Newton can find the root with quadratic convergence.

It can also behave very badly and fly off to infinity...

Newton’s method: F(x)=0\mathbf{F}(\mathbf{x})=\mathbf{0}

Newton’s method for a vector valued function is actually very similar (without the comfort of things such as brackets)

Again for general x\mathbf{x}, F(x)0\mathbf{F}(\mathbf{x})\neq\mathbf{0} but is a vector often called the “residual”.

By the rules of vector norms

F(x)0||\mathbf{F}(\mathbf{x})|| \geq 0

and is only zero at a root of F\mathbf{F}.

If the residual is non-zero for some arbitrary x\mathbf{x}, Newton’s method assumes that there is some correction vector δ\boldsymbol{\delta} such that

F(x+δ)=0\mathbf{F}(\mathbf{x}+\boldsymbol{\delta})=\mathbf{0}

And we simply need to expand a vector-valued function in its Taylor series and truncate at linear terms to solve for the correction.

Preliminaries: Taylor Series expansion of f(x):RnRf(\mathbf{x}):\mathbb{R}^n\rightarrow\mathbb{R}.

if f(x)f(\mathbf{x}) is a scalar function of a vector valued variable (e.g. a surface like f1(xf_1(\mathbf{x}), then the Taylor series of

f(x+δ)=f(x)+i=1nfxi(x)δi+12j=1ni=1n2fxixj(x)δiδj+HOTf(\mathbf{x} +\boldsymbol{\delta}) = f(\mathbf{x}) + \sum_{i=1}^n \frac{\partial f}{\partial x_i}(\mathbf{x})\delta_i + \frac{1}{2}\sum_{j=1}^n\sum_{i=1}^n \frac{\partial^2 f}{\partial x_i\partial x_j}(\mathbf{x})\delta_i\delta_j + HOT

or more beautifully in vector notation

f(x+δ)=f(x)+(f)Tδ+12δTH(x)δ+HOTf(\mathbf{x} +\boldsymbol{\delta}) = f(\mathbf{x}) + (\nabla f)^T\boldsymbol{\delta} + \frac{1}{2}\boldsymbol{\delta}^TH(\mathbf{x})\boldsymbol{\delta} + HOT

where f\nabla f is the gradient of ff, and HH is the Hessian, a symmetric matrix whose components are

Hij=2fxixjH_{ij} = \frac{\partial^2 f}{\partial x_i\partial x_j}

Taylor Series expansion of F(x):RnRn\mathbf{F}(\mathbf{x}):\mathbb{R}^n\rightarrow\mathbb{R}^n.

This is readily extended to vector valued functions as

F(x+δ)=[F1(x+δ)F2(x+δ)Fn(x+δ)][F1(x)+(F1)TδF2(x)+(F2)TδFn(x)+(Fn)Tδ]+HOT\mathbf{F}(\mathbf{x}+\boldsymbol{\delta}) = \begin{bmatrix} F_1(\mathbf{x}+\boldsymbol{\delta}) \\ F_2(\mathbf{x}+\boldsymbol{\delta}) \\ \vdots \\ F_n(\mathbf{x}+\boldsymbol{\delta}) \\ \end{bmatrix} \approx \begin{bmatrix} F_1(\mathbf{x}) + (\nabla F_1)^T\boldsymbol{\delta} \\ F_2(\mathbf{x}) + (\nabla F_2)^T\boldsymbol{\delta} \\ \vdots \\ F_n(\mathbf{x}) + (\nabla F_n)^T\boldsymbol{\delta} \\ \end{bmatrix} + HOT

or more succinctly as

F(x+δ)F(x)+J(x)δ\mathbf{F}(\mathbf{x}+\boldsymbol{\delta}) \approx \mathbf{F}(\mathbf{x}) + J(\mathbf{x})\boldsymbol{\delta}

where JJ is the ``Jacobian’’ of F\mathbf{F}

The Jacobian

The Jacobian of F(x)\mathbf{F}(\mathbf{x}) is the matrix whose rows are the gradients of the individual functions Fi(x)F_i(\mathbf{x})

J=[F1TF2TFnT]J = \begin{bmatrix} \nabla F_1^T \\ \nabla F_2^T \\ \vdots \\ \nabla F_n^T\\ \end{bmatrix}

or whose components are

Jij=FixjJ_{ij} = \frac{\partial F_i}{\partial x_j}

i.e. all the partial derivatives of all the functions with respect to all the variables

Example

Our toy non-linear problem $$

x12+2x22=γ1x1+5x2=γ2\begin{align} x_1^2 + 2x_2^2 &=\gamma_1 \\ x_1 + 5x_2 &=\gamma_2 \\ \end{align}

$$

Can be rewritten in residual form as

F(x)=[x12+2x22γ1x1+5x2γ2]\mathbf{F}(\mathbf{x}) = \begin{bmatrix} x_1^2 + 2x_2^2 - \gamma_1 \\ x_1 + 5x_2 - \gamma_2 \\ \end{bmatrix}

with Jacobian?

J(x)=[2x14x215]J(\mathbf{x}) = \begin{bmatrix} 2x_1 & 4x_2 \\ 1 & 5 \\ \end{bmatrix}

Newton’s method for F(x)=0\mathbf{F}(\mathbf{x}) = \mathbf{0} (finally)

Given all the pieces, raw Newton for F(x):RnRn\mathbf{F}(\mathbf{x}):\mathbb{R}^n\rightarrow\mathbb{R}^n is simply

given an initial guess x\mathbf{x} such that F(x)0\mathbf{F}(\mathbf{x})\neq\mathbf{0}: seek a correction such that

F(x+δ)F(x)+Jδ=0\mathbf{F}(\mathbf{x}+\boldsymbol{\delta})\approx \mathbf{F}(\mathbf{x}) + J\boldsymbol{\delta} = \mathbf{0}

or solve

Jδ=F(x)J\boldsymbol{\delta} = -\mathbf{F}(\mathbf{x})

and correct xx+δ\mathbf{x}\leftarrow\mathbf{x}+\boldsymbol{\delta}

or as an iterative scheme:

Set an initial guess x0x_0. Then iterate for k=0,1,2k=0,1,2\ldots

Solve:J(xk)δk=F(xk)xk+1=xk+δk\begin{align} \text{Solve:} & & J(\mathbf{x}_k)\boldsymbol{\delta_k} &= -\mathbf{F}(\mathbf{x}_k)\\ & &\mathbf{x}_{k+1} &= \mathbf{x}_k + \boldsymbol{\delta}_k \end{align}

until F(xk)<tol||\mathbf{F}(\mathbf{x}_k)|| < \text{tol}

Minor digression: Linear systems

We can actually use Newton to solve linear systems as well. Consider the general linear problem

Ax=bA\mathbf{x} = \mathbf{b}

which we can rewrite in residual form as

F(x)=Axb=0\mathbf{F}(\mathbf{x}) = A\mathbf{x} - \mathbf{b} = \mathbf{0}

i.e. the solution of our original problem is also a root of F\mathbf{F}.

For any other initial guess x0\mathbf{x}_0, F(x0)0\mathbf{F}(\mathbf{x}_0)\neq\mathbf{0}. However we can easily find the correction δ\boldsymbol{\delta} as

F(x0+δ)=Ax0+Aδb=0=Aδ+F(x0)=0\begin{align} \mathbf{F}(\mathbf{x}_0+\boldsymbol{\delta})&= A\mathbf{x_0} + A\boldsymbol{\delta} -\mathbf{b} = \mathbf{0}\\ &=A\boldsymbol{\delta} + \mathbf{F}(\mathbf{x}_0) = \mathbf{0}\\ \end{align}

which we can solve immediately for the correction by solving the linear problem

Aδ=F(x0)A\boldsymbol{\delta} = -\mathbf{F}(\mathbf{x}_0)

Which is the same as the general Newton problem if A=JA=J

But it is!

if

F(x)=Axb\mathbf{F}(\mathbf{x}) = A\mathbf{x} - \mathbf{b}

Then in index notation

Fi(x)=jaijxjbiF_i(\mathbf{x}) = \sum_{j} a_{ij}x_j - b_i

And the Jacobian

Jij=Fixj=aijJ_{ij} = \frac{\partial F_i}{\partial x_j} = a_{ij}

or J=AJ = A

Or the solution of our original problem is

x=x0+δ=x0A1F(x0)\begin{align} \mathbf{x} &= \mathbf{x}_0 + \boldsymbol{\delta} \\ &= \mathbf{x}_0 - A^{-1}\mathbf{F}(\mathbf{x_0}) \end{align}

Which for a simple linear function F(x)=Axb\mathbf{F}(\mathbf{x}) = A\mathbf{x} - \mathbf{b} becomes

x=x0A1(Ax0b)=A1b\begin{align*} \mathbf{x} &= \mathbf{x}_0 - A^{-1}\left(A\mathbf{x}_0 - \mathbf{b}\right) \\ &= A^{-1}\mathbf{b}\\ \end{align*}

Thus we can solve for either the solution itself or the correction with the residual. These are equivalent

# Some quick and dirty Code
def newton(F, J, x0, tol=1.0e-6, MAX_ITS=100, verbose=True):
    """Solve F(x) = 0 using Newton's method until ||F(x)|| < tol or reaches Max iterations

    Params:
    -------
        F: calleable:
            Function returning residual F(x)
        J: calleable
            Function returning Jacobian J(x)
        tol: float
            stopping criteria for ||F|| < tol
        MAX_ITS: integer
            maximum number of iterations
        verbose: bool
            if true spits out norm of the residual at each iteration

    Returns:
    --------
        x: list
            list of points for each Newton iteration (just used for plotting intermediate results)
            the solution is x[-1]
    Raises:
    -----------
        ValueError if number of its exceeds MAX_ITS

    """
    x = [x0]
    for k in range(MAX_ITS + 1):
        xk = x[k]
        res = numpy.linalg.norm(F(xk))
        if verbose:
            print("k = {}, ||F|| = {}".format(k, res))
        delta = numpy.linalg.solve(J(xk), -F(xk))
        x.append(xk + delta)

        if res < tol:
            return numpy.array(x), k

    raise ValueError("Maximum number of iterations exceeded {}".format(MAX_ITS))

Let’s test with our toy problem

gamma = numpy.array([10, -2])

F = lambda x: numpy.array(
    [x[0] ** 2 + 2 * x[1] ** 2 - gamma[0], x[0] + 5 * x[1] - gamma[1]]
)

J = lambda x: numpy.array([[2 * x[0], 4 * x[1]], [1, 5]])
x0 = numpy.array([-0.5, 2])

x, nits = newton(F, J, x0)
print("\n", x)
xa = numpy.linspace(-4, 4, 100)
ya = numpy.linspace(-4, 4, 100)
X, Y = numpy.meshgrid(xa, ya)

Z1 = lambda gamma: X**2 + 2 * Y**2 - gamma
Z2 = lambda gamma: X + 5 * Y - gamma

fig = plt.figure(figsize=(8, 6))

axes = fig.add_subplot(1, 1, 1)
axes.contour(X, Y, Z1(gamma[0]), [0], colors="r")
axes.contour(X, Y, Z2(gamma[1]), [0], colors="b")
for xi in x:
    axes.plot(xi[0], xi[1], "gx", markersize=8)
axes.plot(xi[0], xi[1], "go", markersize=10)

axes.axis("equal")

axes.set_xlabel("$x_1$")
axes.set_ylabel("$x_2$")
axes.set_title("$\gamma_1={}, \gamma_2={}$".format(gamma[0], gamma[1]))
axes.grid()

plt.show()

A project idea:

Write an adaptive BDF-2/BDF-3 scheme that can solve a general non-linear problem and test it using the vander-pol oscillator. This turns out to be a lot more challenging than it looks for the vander-pol problem.