Root finding

The roots of a function are the set of points at which that function evaluates to zero. Roots are an important concept since many numerical problems can be reduced to solving equations of the form \(\mathsf{f(x) = 0}\). The zero here is completely general; it includes the following cases:

  • Crossing at an arbitrary height \(\mathsf{f(x) = c}\) can be rewritten as \(\mathsf{f(x) - c = 0}\).

  • Intersection of two functions \(\mathsf{f(x) = g(x)}\) is equivalent to \(\mathsf{f(x) - g(x) = 0}\).

We want to consider algorithms that can efficiently compute the roots of \(\mathsf{f}\), some function of a single variable. In the case of a polynomial written as a product of linear factors, \(\mathsf{f(x)} = \mathsf{(x-r_1)(x-r_2)} \cdots\mathsf{(x-r_N)}\), picking out the roots is trivial. They are just the points \(\{\mathsf{r_1}, \mathsf{r_2}, \ldots, \mathsf{r_N}\}\). Thus, for polynomials, root-finding amounts to factorization. (This in itself is a hard problem.) For non-polynomial functions, we need a more general framework.

Bisection method

Suppose we can identify an interval \([\mathsf{a,b}]\) in which \(\mathsf{f(a)f(b) < 0}\). Provided that \(\mathsf{f(x)}\) is continuous, the fact that the values \(\mathsf{f(a)}\) and \(\mathsf{f(b)}\) have opposite sign guarantees that \(\mathsf{f(x)}\) changes sign somewhere in between (via the intermediate value theorem). That is, there exists an \(\mathsf{x_0}\) in \([\mathsf{a,b}]\) such that \(\mathsf{f(x_0) = 0}\).

Another way to think about this is that if \(\mathsf{f(a)f(b) < 0}\), then \(\mathsf{f(x)}\) must cross the x-axis an odd number of times \((\mathsf{1}, \mathsf{3}, \mathsf{5}, \mathsf{7}, \ldots)\) between \(\mathsf{a}\) and \(\mathsf{b}\); whereas if \(\mathsf{f(a)f(b) > 0}\), then \(\mathsf{f(x)}\) must cross it an even number of times, including the possibility of zero \((\mathsf{0}, \mathsf{2}, \mathsf{4}, \mathsf{6}, \ldots)\).

Let’s assume that we can make an initial guess for \([\mathsf{a,b}]\) that does lead to an interior point satisfying \(\mathsf{f(x_0) = 0}\). Then we can subdivide the interval into subintervals \([\mathsf{a,c}]\) and \([\mathsf{c,b}]\), where \(\mathsf{c = (a+b)/2}\) is the midpoint. There are now only two possibilities, either \(\mathsf{f(a)f(c) < 0}\) or \(\mathsf{f(c)f(b) < 0}\), which correspond to the crossing point being in \([\mathsf{a,c}]\) or \([\mathsf{c,b}]\). The method can be applied again on the appropriate subinterval; the recursion ends when the interval bracketing the root shrinks to a specified tolerance. Since the interval repeatedly halves in size, one bit of significance is gained with each iteration.

Here is a sample implementation:

#include <cassert>

double bisection_method_v1( double (&f)(double), double a, double b, double tolerance)
{
   assert(f(a)*f(b) < 0.0);
   assert(a < b);
   const double width = b-a;
   const double c = 0.5*(a+b);
   if (width < tolerance) return c;
   if (f(a)*f(c) < 0.0) return bisection_method_v1(f,a,c,tolerance);
   return bisection_method_v1(f,c,b,tolerance);
}

Here is another that does not use recursion:

double bisection_method_v2( double (&f)(double), double a, double b, double tolerance)
{
   assert(f(a)*f(b) < 0.0);
   assert(a < b);
   double c = 0.5*(a+b);
   do
   {
      if (f(a)*f(c) < 0.0)
         b = c;
      else
      {
         assert(f(c)*f(b) < 0.0);
         a = c;
      }
      assert(a < b);
      c = 0.5*(a+b);
   } while (b - a > tolerance);   // loop until the bracket is narrower than the tolerance
   return c;
}
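
For example, a minimal driver might look like the following (a sketch, assuming bisection_method_v2 above is in scope; the test function \(\mathsf{g(x)} = \mathsf{x^3 - x - 2}\) is just an arbitrary choice with a sign change on \([\mathsf{1,2}]\)):

#include <cstdio>

double g(double x) { return x*x*x - x - 2.0; }   // single real root near x = 1.5214

int main()
{
   // g(1) < 0 and g(2) > 0, so [1,2] brackets the root
   const double root = bisection_method_v2(g, 1.0, 2.0, 1.0e-12);
   std::printf("root = %.15f   g(root) = %e\n", root, g(root));
   return 0;
}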

Why do we bother with a tolerance variable at all? Can’t we just loop while (f(c) != 0.0)? No! There’s no reason to expect that the variable c will ever be close enough to the root that f(c) vanishes. Remember that the true value of the root may not have a floating-point representation. Moreover, the function itself may have truncation or round-off errors associated with its own internal computation.

Recap: The bisection method is applicable if we can find an interval \([\mathsf{a,b}]\) such that \(\mathsf{f(x)}\) is continuous throughout the interval and \(\mathsf{f(a)f(b) < 0}\). It is guaranteed to approach the root, but it does so slowly with a one-bit improvement in the estimate for each iteration.

Newton’s Method

An alternative method, developed independently by Newton and Raphson, attempts to improve on an initial guess by linearly extrapolating from the tangent at that point. Assume that our guess \(\mathsf{x}^*\) is some small distance \(\Delta\mathsf{x}\) from the root, \(\mathsf{x_0} = \mathsf{x}^* + \Delta \mathsf{x}\). A first order Taylor series expansion, \(\mathsf{0} = \mathsf{f(x_0)} = \mathsf{f(x^* + \Delta x)} \approx \mathsf{f(x^*)} + \mathsf{f'(x^*)\Delta x}\), suggests that the discrepancy is \(\mathsf{\Delta x} \approx -\mathsf{f(x^*)/f'(x^*)}\). From this we can build a sequence—let’s call it \((\mathsf{x_n})\) now—of improved guesses constructed according to the following rule:

\[\mathsf{x_{n+1}} := \mathsf{x_n} - \mathsf{\frac{f(x_n)}{f'(x_n)}}\]

For Newton’s method, it is no longer enough that \(\mathsf{f(x)}\) be continuous. It must also be smooth: i.e., \(\mathsf{f'(x)}\) must be continuous too. There is the further problem that the method is not guaranteed to converge. In fact, it’s very likely not to if the initial guess is too far from the root (as determined by the local curvature). Its advantage over bisection is that when it does converge, it does so quadratically: the number of accurate bits doubles with each iteration.
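
In code, the update rule takes only a few lines. Here is a minimal sketch (the name newton_method, the step-size stopping criterion, and the iteration cap are choices of this example rather than part of the method itself):

#include <cmath>

double newton_method( double (&f)(double), double (&fprime)(double),
                      double x, double tolerance, int max_iterations = 40)
{
   for (int n = 0; n < max_iterations; ++n)
   {
      const double step = f(x)/fprime(x);
      x -= step;                               // x <- x - f(x)/f'(x)
      if (std::fabs(step) < tolerance) break;  // stop once the update is negligible
   }
   return x;
}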

Try this on your own

Heron’s method for finding the square root of 2 can be understood as an application of Newton-Raphson root finding to the function \(\mathsf{f(x)} = \mathsf{x^2 - 2}\). Show that this leads to the recursion relation \(\mathsf{x_{n+1}} := \mathsf{x_n/2} + \mathsf{1/x_n}.\) Run this on your computer and confirm numerically that the resulting sequence converges to the desired value.
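
A possible sketch (the starting guess and the iteration count are arbitrary choices):

#include <cstdio>

int main()
{
   double x = 1.0;                 // arbitrary positive starting guess
   for (int n = 1; n <= 6; ++n)
   {
      x = 0.5*x + 1.0/x;           // Heron's recursion for sqrt(2)
      std::printf("x_%d = %.16f\n", n, x);
   }
   return 0;
}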

A closer look at Newton’s method

Newton’s method is an iterative root-finding scheme that improves on the current guess \(\mathsf{x_n}\) by extrapolating to the x-axis along the local tangent. We derived the recursion rule by keeping only the linear term in a Taylor expansion around the guess.

The requirements for this method are as follows:

  • Unlike the bisection method, we do not need to find an interval \([\mathsf{a,b}]\) that encloses the root.

  • We do need a good first guess.

  • We have to have some reliable way of computing \(\mathsf{f'(x)}\).

  • The function \(\mathsf{f}\) must be smooth and continuous.

The method often converges very fast, but it is not entirely reliable.

  • Unlike the bisection method, it is not guaranteed to converge.

  • There is usually a basin of attraction around each root (whose size is hard to predict).

  • Some initial guesses will cause the method to fail immediately or to cycle endlessly.

There are, however, compensating advantages. Unlike the bisection method, Newton’s method is generalizable

  • to complex-valued functions

  • to higher dimensions

The ideal situation is when \(\mathsf{f(x_n)} \to \mathsf{0}\) as \(\mathsf{n} \to \infty\) while \(\mathsf{f'(x_n)}\) remains bounded away from zero. In that case, the step sizes get smaller with each iteration and there are no numerical issues associated with the floating-point division. If \(\mathsf{f'(x_n)}\) also goes to zero (e.g., in a polynomial with a root of higher multiplicity) then in principle the convergence is downgraded from quadratic to linear. In practice, the method may break down completely if the floating-point calculation of \(\mathsf{f(x_n)/f'(x_n)}\) is sufficiently close to a 0/0.
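
The slowdown is easy to observe numerically. For the double root of \(\mathsf{f(x)} = \mathsf{(x-1)^2}\), the Newton step only halves the error at each iteration, so the convergence is linear rather than quadratic. A small experiment (the function and starting point are arbitrary choices):

#include <cmath>
#include <cstdio>

int main()
{
   double x = 2.0;   // start one unit away from the double root at x = 1
   for (int n = 1; n <= 10; ++n)
   {
      x -= (x-1.0)*(x-1.0) / (2.0*(x-1.0));   // Newton step for f(x) = (x-1)^2
      std::printf("error after %2d iterations: %e\n", n, std::fabs(x-1.0));
   }
   return 0;
}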

Remember that the recursion relation was derived as a Taylor expansion around a guess \(\mathsf{x}^*\), truncated at first order. We may run into trouble if, by virtue of \(\mathsf{f(x^*)} \approx \mathsf{0}\), the second order term is comparable to the first within machine accuracy.

Similar considerations apply if we don’t have access to \(\mathsf{f'(x)}\). If we’re very lucky, we have analytic expressions for both: e.g.,

#include <cmath>

double f(double x)      { return std::cos(x) - x*x*x; }
double fprime(double x) { return -std::sin(x) - 3*x*x; }

More likely, \(\mathsf{f(x)}\) is some complicated, black-box calculation, and there’s no possibility of determining an analytic expression for its derivative. In that case, we have to resort to calculating the derivative numerically. We can do something along the lines of

\[\mathsf{f'(x)} = \textsf{lim}_{\mathsf{h} \to \mathsf{0}} \mathsf{\frac{f(x+h)-f(x)}{h}} \]

—i.e., we can try to compute \(\mathsf{f(x)}\) at two nearby points. The question is, how near? If \(\mathsf{h}\) is too small, we run into a 0/0 floating-point catastrophe. But we do want to make sure that \(\mathsf{h}\) is small enough that the second order terms in the Taylor expansion

\[\mathsf{f(x+h)} = \mathsf{f(x)} + \mathsf{f'(x)h} + \mathsf{\frac{1}{2}f''(x)h^2} + \cdots\]

are negligible. Of course, checking whether \(\mathsf{(1/2)}|\mathsf{f''(x)}|\mathsf{h^2} \ll |\mathsf{f'(x)}|\mathsf{h}\) is impossible since we know nothing about \(\mathsf{f''(x)}\).
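
In practice, a common rule of thumb is to take \(\mathsf{h}\) proportional to the square root of the machine epsilon, scaled by the magnitude of \(\mathsf{x}\), which roughly balances truncation error against floating-point round-off. A sketch of such a forward-difference estimate (the scaling is a heuristic, not a derivation):

#include <algorithm>
#include <cmath>
#include <limits>

double numerical_derivative( double (&f)(double), double x)
{
   const double eps = std::numeric_limits<double>::epsilon();
   const double h = std::sqrt(eps) * std::max(std::fabs(x), 1.0);   // heuristic step size
   return (f(x+h) - f(x)) / h;                                      // forward difference
}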

A popular choice is to replace the derivative by a finite difference constructed from the two previous guesses:

\[\mathsf{x_{n+1}} = \mathsf{x_n} - \mathsf{\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} f(x_n)}\]

This usually goes by the name Secant method (a secant is a line that intersects two points on a curve). Now, \(\mathsf{x_{n+1}}\) depends on both \(\mathsf{x_n}\) and \(\mathsf{x_{n-1}}\), so two initial guesses are required. Convergence is intermediate between linear and quadratic: the number of accurate bits grows by a factor of roughly 1.618 (the golden ratio) with each iteration. This method only behaves well if the root is simple.
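
A sketch of the corresponding routine (the name secant_method, the stopping rule, and the iteration cap are choices of this example):

#include <cmath>

double secant_method( double (&f)(double), double x0, double x1,
                      double tolerance, int max_iterations = 40)
{
   double f0 = f(x0);
   for (int n = 0; n < max_iterations; ++n)
   {
      const double f1 = f(x1);
      if (f1 == f0) break;                              // avoid division by zero
      const double x2 = x1 - (x1 - x0)/(f1 - f0)*f1;    // secant update
      x0 = x1; f0 = f1;
      x1 = x2;
      if (std::fabs(x1 - x0) < tolerance) break;        // successive guesses agree
   }
   return x1;
}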

Connection to extremum problems

One of the most common procedures we need to carry out in scientific problems is an extremization of a function \(\mathsf{F(x)}\) over some range of its arguments. Unless the extremum—let’s call it \(\mathsf{x_e}\)—lies along the boundary of the phase space of \(\mathsf{x}\), it will satisfy \(\mathsf{F'(x_e) = 0}\) (or \(\mathsf{\nabla F = 0}\) in higher dimensions). In other words, it is one of the roots of \(\mathsf{f(x) = F'(x)}\).

In terms of \(\mathsf{F(x)}\), Newton’s method can be understood as a sequence of steps

\[\mathsf{\Delta x} = \mathsf{x_{n+1}} - \mathsf{x_n} = -\mathsf{\frac{F'(x_n)}{F''(x_n)}}\]

If we imagine those steps taking place sequentially at discrete instants \(\mathsf{t_n= n\Delta t}\) of fictitious time [with \(\mathsf{x(t_{n+1})} = \mathsf{x_{n+1}}\) occurring later than \(\mathsf{x(t_n)} = \mathsf{x_n}\)], then we can understand Newton’s method as the flow of the variable \(\mathsf{x}\) either uphill or downhill (depending on the sign of \(\mathsf{F''}\)) with respect to the landscape defined by \(\mathsf{F(x)}\).

For Newton’s method, the size of the change \(\Delta \mathsf{x}\) between discrete time points is proportional to the inverse of the second derivative (or of the Hessian matrix in higher dimensions), which has a variable magnitude and sign. If we fix the sign to be positive, then the flow of \(\mathsf{x}\) will always be downhill, and we can regard the Newton step as one choice of step size in a family of methods that follow \(\mathsf{F(x)}\) downhill. Indeed, in the vicinity of a minimum, Newton’s method is just one variant of the local optimization strategy called steepest descent.
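
To make the connection concrete, here is a small sketch that applies the Newton recursion to \(\mathsf{f(x)} = \mathsf{F'(x)}\) for the arbitrary example \(\mathsf{F(x)} = \mathsf{x^4 - 2x^2}\), whose minima sit at \(\mathsf{x = \pm 1}\):

#include <cmath>
#include <cstdio>

double Fprime(double x)  { return 4.0*x*x*x - 4.0*x; }   // F'(x)  for F(x) = x^4 - 2x^2
double Fsecond(double x) { return 12.0*x*x - 4.0; }      // F''(x)

int main()
{
   double x = 2.0;        // initial guess in the basin of the minimum at x = +1
   for (int n = 0; n < 20; ++n)
   {
      const double step = Fprime(x)/Fsecond(x);
      x -= step;          // x <- x - F'(x)/F''(x)
      if (std::fabs(step) < 1.0e-12) break;
   }
   std::printf("stationary point: x = %.12f   F'(x) = %e\n", x, Fprime(x));
   return 0;
}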