Training a Neural Network Using Newton’s Method

Newton’s Method

Newton’s method is based on the observation that using the second derivative in addition to the first gives a better local approximation of the function. The resulting approximation is no longer linear but quadratic, and the next point is taken to be the minimum of that quadratic.
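The quadratic approximation can be written out explicitly. The following is a sketch for a function of one variable; the symbols f and x_n are notation assumed here for illustration, and f''(x_n) is assumed to be positive so that the quadratic has a minimum.

```latex
% Second-order Taylor approximation of f around the current point x_n
f(x) \approx f(x_n) + f'(x_n)\,(x - x_n) + \tfrac{1}{2}\, f''(x_n)\,(x - x_n)^2

% Setting the derivative of the right-hand side with respect to x to zero
f'(x_n) + f''(x_n)\,(x - x_n) = 0

% Solving for x gives the next point
x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}
```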

Now, to find the next point X3, the process can be repeated from the point just obtained.
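Repeating the step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `newton_minimize`, the test function f(x) = x^4/4 - x, and the starting point are all assumptions chosen for the example.

```python
def newton_minimize(grad, hess, x0, steps=20, tol=1e-10):
    """Repeat the Newton update x <- x - f'(x) / f''(x).

    grad and hess are callables returning f'(x) and f''(x).
    Returns the final point and the list of all visited points.
    """
    x = x0
    history = [x]  # history[0] is X1, history[1] is X2, and so on
    for _ in range(steps):
        step = grad(x) / hess(x)  # assumes f''(x) != 0 at every visited point
        x = x - step
        history.append(x)
        if abs(step) < tol:  # stop once the update no longer moves the point
            break
    return x, history


# Hypothetical example: f(x) = x**4 / 4 - x, which has its minimum at x = 1.
x_min, points = newton_minimize(
    grad=lambda x: x**3 - 1.0,   # f'(x)
    hess=lambda x: 3.0 * x**2,   # f''(x)
    x0=2.0,
)
print(points[:3])  # X1, X2, X3
print(x_min)
```

Each pass through the loop builds a new quadratic approximation at the current point and jumps to its minimum, which is exactly the "repeat the process" step described above.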

Conclusion: Newton’s Method vs Gradient Descent

Gradient descent is a first-order method: it uses the first derivative of a function to find a minimum. Newton’s method is a root-finding algorithm; applied to the first derivative, it uses the second derivative to find a minimum of the function. Using the second derivative can be faster, but only if it is known and can be computed easily. In most cases it requires a lot of computation and can be expensive. If the first derivative requires 𝑁 values, the second requires 𝑁² (for a network with 𝑁 parameters, the gradient has 𝑁 entries and the matrix of second derivatives has 𝑁² entries).
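The trade-off can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not a benchmark: the test function f(x) = x^4/4 - x, the learning rate 0.1, the tolerance, and the helper names `count_steps` and `derivative_sizes` are all chosen here for illustration.

```python
def grad(x):
    """First derivative of the toy function f(x) = x**4 / 4 - x."""
    return x**3 - 1.0


def hess(x):
    """Second derivative of the same toy function."""
    return 3.0 * x**2


def count_steps(update, x0, tol=1e-8, max_steps=10_000):
    """Apply `update` until consecutive points differ by less than tol."""
    x = x0
    for k in range(1, max_steps + 1):
        x_new = update(x)
        if abs(x_new - x) < tol:
            return k, x_new
        x = x_new
    return max_steps, x


def derivative_sizes(n_params):
    """Number of entries in the gradient and in the second-derivative matrix."""
    return n_params, n_params ** 2


# Gradient descent: first derivative only, with a fixed (assumed) learning rate.
gd_steps, gd_x = count_steps(lambda x: x - 0.1 * grad(x), x0=2.0)

# Newton's method: first derivative divided by the second derivative.
newton_steps, newton_x = count_steps(lambda x: x - grad(x) / hess(x), x0=2.0)

print("gradient descent steps:", gd_steps)
print("Newton steps:", newton_steps)
print("entries for 1000 parameters:", derivative_sizes(1000))
```

On this one-dimensional toy problem Newton’s method needs fewer steps, but each step costs more, and the gap in cost grows quadratically with the number of parameters.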



