# Newton's method

The Babylonian method is an algorithm to find an approximate value for $\sqrt{k}$. It was described by the first-century Greek mathematician Hero of Alexandria.

The method starts with some initial guess, called $x_0$. It then applies a formula to produce an improved guess. This is repeated until the improved guess is accurate enough or it is clear the algorithm fails to work.

For the Babylonian method, the next guess, $x_{i+1}$, derived from the current guess, $x_i$, is:

$~ x_{i+1} = \frac{1}{2}(x_i + \frac{k}{x_i}) ~$

We use this algorithm to approximate the square root of $2$, a value known to the Babylonians.

Start with $x$, then form $x/2 + 1/x$, from this again form $x/2 + 1/x$, repeat.
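In code, the update step and its iteration might be sketched as follows (the helper names `babylonian_step` and `babylonian` are our own choices, not from a package):

```julia
# One Babylonian update toward sqrt(k)
babylonian_step(x, k) = (x + k/x) / 2

# Apply n update steps starting from the guess x0
function babylonian(k; x0 = k/2, n = 5)
    x = float(x0)
    for _ in 1:n
        x = babylonian_step(x, k)
    end
    x
end

babylonian(2; x0 = 2)   # ≈ 1.4142135623730951
```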

Let's start with $x = 2$, represented as a rational number:

```julia
x = 2//1
x = x//2 + 1//x
x, x^2.0
```

```
(3//2, 2.25)
```


Our estimate improved from something which squares to $4$ to something which squares to $2.25$. A big improvement, but there is still more to come.

```julia
x = x//2 + 1//x
x, x^2.0
```

```
(17//12, 2.0069444444444446)
```


The error in $x^2$ now shows up only in the third decimal place.

```julia
x = x//2 + 1//x
x, x^2.0
```

```
(577//408, 2.000006007304883)
```


The error in $x^2$ now appears only in the sixth decimal place. That is about as far as we, or the Babylonians, would want to go by hand. The rational numbers involved quickly grow unwieldy; the next step shows the explosion:

```julia
x = x//2 + 1//x
```

```
665857//470832
```


However, with floating point numbers the method stays quite manageable:

```julia
x = 2.0
x = x/2 + 1/x   # 1.5, 2.25
x = x/2 + 1/x   # 1.4166666666666665, 2.006944444444444
x = x/2 + 1/x   # 1.4142156862745097, 2.0000060073048824
x = x/2 + 1/x   # 1.4142135623746899, 2.0000000000045106
x = x/2 + 1/x   # 1.414213562373095,  1.9999999999999996
x = x/2 + 1/x   # 1.414213562373095,  1.9999999999999996
```

```
1.414213562373095
```


We see that the algorithm, to the precision offered by floating point numbers, has settled on the answer `1.414213562373095`. This answer is an approximation to the actual answer. Approximation is necessary, as $\sqrt{2}$ is an irrational number and so can never be exactly represented in floating point. That being said, the value of $x^2$ is accurate to the last decimal place, so our approximation is very close and is achieved in a few steps.
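As a quick check, this fixed point agrees with Julia's built-in `sqrt` to within a unit or so in the last place:

```julia
x = 1.414213562373095    # the value produced above
abs(x - sqrt(2))         # on the order of 1e-16
```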

## Newton's generalization

Let $f(x) = x^3 - 2x - 5$. The value $2$ is almost a zero, but not quite, as $f(2) = -1$. We can check that there are no rational roots. Though there is a formula to solve the cubic, it is cumbersome to apply and not as generally applicable as an algorithm like the Babylonian method for producing an approximate answer.

Is there some generalization to the Babylonian method?

We know that the tangent line is a good approximation to a function near the point of tangency. Graphing $f$ together with its tangent line at $x = 2$ hints at an algorithm:

The tangent line and the function nearly agree near $2$. So much so, that the intersection point of the tangent line with the $x$ axis nearly hides the actual zero of $f(x)$, which is near $2.1$.

That is, the intersection of the tangent line and the $x$ axis should be an improved approximation for the zero of the function.
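Before automating this, the idea can be checked numerically with a hand-computed derivative, $f'(x) = 3x^2 - 2$ (the name `fp` is our choice):

```julia
f(x)  = x^3 - 2x - 5
fp(x) = 3x^2 - 2             # derivative of f, computed by hand

x0 = 2
x1 = x0 - f(x0) / fp(x0)     # tangent line's intersection with the x axis
x1, f(x1)                    # x1 == 2.1; f(x1) ≈ 0.061, much closer to 0 than f(2) = -1
```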

Let $x_0$ be $2$, and $x_1$ be the intersection point of the tangent line at $(x_0, f(x_0))$ with the $x$ axis. Then by the definition of the tangent line:

$~ f'(x_0) = \frac{\Delta y }{\Delta x} = \frac{f(x_0)}{x_0 - x_1}. ~$

This can be solved for $x_1$ to give $x_1 = x_0 - f(x_0)/f'(x_0)$. In general, if we had $x_i$ and used the intersection point of the tangent line to produce $x_{i+1}$ we would have Newton's method:

$~ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}. ~$

We will use automatic derivatives where possible, so we load the `CalculusWithJulia` package, which provides the `f'` notation for derivatives through the definition `Base.adjoint(f::Function) = x -> ForwardDiff.derivative(f, float(x))`:

```julia
using CalculusWithJulia
using Plots
```


With this, and with $f(x) = x^3 - 2x - 5$ defined, the algorithm above starting from $2$ becomes:

```julia
f(x) = x^3 - 2x - 5
x0 = 2
x1 = x0 - f(x0)/f'(x0)
```

```
2.1
```


We can see we are closer to a zero:

```julia
f(x0), f(x1)
```

```
(-1, 0.06100000000000083)
```


Trying again, we have

```julia
x2 = x1 - f(x1)/f'(x1)
x2, f(x2), f(x1)
```

```
(2.094568121104185, 0.00018572317327247845, 0.06100000000000083)
```


And again:

```julia
x3 = x2 - f(x2)/f'(x2)
x3, f(x3), f(x2)
```

```
(2.094551481698199, 1.7397612239733462e-9, 0.00018572317327247845)
```

```julia
x4 = x3 - f(x3)/f'(x3)
x4, f(x4), f(x3)
```

```
(2.0945514815423265, -8.881784197001252e-16, 1.7397612239733462e-9)
```


We see now that $f(x_4)$ is within machine tolerance of $0$, so we call $x_4$ an approximate zero of $f(x)$.

**Newton's method**: Let $x_0$ be an initial guess for a zero of $f(x)$. Iteratively define $x_{i+1}$ in terms of the just-generated $x_i$ by $x_{i+1} = x_i - f(x_i)/f'(x_i)$. Then, for most functions and reasonable initial guesses, the sequence of points converges to a zero of $f$.

On the computer, we know that actual convergence will likely never occur, but accuracy to a certain tolerance can often be achieved.

In the example above, we kept track of the previous values. This is unnecessary if only the answer is sought. In that case, the update step can use the same variable:

```julia
x = 2                     # x0
x = x - f(x) / f'(x)      # x1
x = x - f(x) / f'(x)      # x2
x = x - f(x) / f'(x)      # x3
x = x - f(x) / f'(x)      # x4
```

```
2.0945514815423265
```


As seen above, each assignment updates the value bound to `x`, using the previous value of `x` in the computation.

We implement the algorithm by repeating the step until either we converge or it is clear we won't converge. For good guesses and most functions, convergence happens quickly.
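One possible implementation is sketched below; the function name `newton_method` and the stopping rules (a tolerance on $|f(x)|$ and a cap on the number of steps) are our choices, not from a package. A hand-written derivative is passed in to keep the sketch self-contained:

```julia
# Iterate x -> x - f(x)/fp(x) until f(x) is essentially zero, or give up.
function newton_method(f, fp, x0; atol = 1e-14, maxsteps = 100)
    x = float(x0)
    for _ in 1:maxsteps
        abs(f(x)) <= atol && return x
        x = x - f(x) / fp(x)
    end
    error("no convergence after $maxsteps steps")
end

f(x)  = x^3 - 2x - 5
fp(x) = 3x^2 - 2
newton_method(f, fp, 2)   # ≈ 2.0945514815423265
```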

## Examples

##### Example: visualizing convergence

A graphic of the first few iterations demonstrates the method's rapid convergence:

*(Figure: illustration of Newton's method converging to a zero of a function.)*

##### Example: numeric not algebraic

For the function $f(x) = \cos(x) - x$, we see that SymPy cannot solve symbolically for a zero:

```julia
@vars x real=true
solve(cos(x) - x, x)
```
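While a symbolic answer is not forthcoming, Newton's method readily produces a numeric one. A sketch, using the hand-computed derivative $-\sin(x) - 1$ and a starting guess of $1$:

```julia
g(x)  = cos(x) - x
gp(x) = -sin(x) - 1       # derivative of g, computed by hand

x = 1.0
x = x - g(x)/gp(x)
x = x - g(x)/gp(x)
x = x - g(x)/gp(x)
x, g(x)                   # x ≈ 0.7390851, with g(x) essentially 0
```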