Root-finding algorithm
A root-finding algorithm is a numerical method, or algorithm, for finding a value x such that f(x) = 0, for a given function f. Such an x is called a root of the function f.

This article is concerned with finding scalar, real or complex roots, approximated as floating point numbers. Finding integer roots or exact algebraic roots are separate problems, whose algorithms have little in common with those discussed here. (For integer roots, see Diophantine equation.)

Finding a root of f(x) − g(x) = 0 is the same as solving the equation f(x) = g(x). Here, x is called the unknown in the equation. Conversely, any equation can take the canonical form f(x) = 0, so equation solving is the same thing as computing (or finding) a root of a function.

Numerical root-finding methods use iteration, producing a sequence of numbers that hopefully converges towards a limit (the so-called "fixed point" of the iteration), which is a root. The first values of this sequence are initial guesses. The method computes subsequent values based on the old ones and the function f.

The behaviour of root-finding algorithms is studied in numerical analysis. Algorithms perform best when they take advantage of known characteristics of the given function. Thus an algorithm to find isolated real roots of a low-degree polynomial in one variable may bear little resemblance to an algorithm for complex roots of a "black-box" function which is not even known to be differentiable. Questions include ability to separate close roots, robustness in achieving reliable answers despite inevitable numerical errors, and rate of convergence.

Bisection Method

The simplest root-finding algorithm is the bisection method. It works when f is a continuous function and it requires previous knowledge of two initial guesses, a and b, such that f(a) and f(b) have opposite signs. Although it is reliable, it converges slowly, gaining one bit of accuracy with each iteration.
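A minimal sketch in Python (the test function, interval and tolerance below are illustrative choices, not part of the original text):

  def bisect(f, a, b, tol=1e-12):
      """Find a root of f in [a, b], assuming a < b and f(a), f(b) have opposite signs."""
      if f(a) * f(b) > 0:
          raise ValueError("f(a) and f(b) must have opposite signs")
      while b - a > tol:
          m = (a + b) / 2          # each halving gains one bit of accuracy
          if f(a) * f(m) <= 0:
              b = m                # root lies in [a, m]
          else:
              a = m                # root lies in [m, b]
      return (a + b) / 2

  # example: the positive root of x^2 - 2
  print(bisect(lambda x: x * x - 2, 1.0, 2.0))  # ~1.4142135623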

Newton's Method (and similar derivative-based methods)

Newton's method assumes the function f to have a continuous derivative. Newton's method may not converge if started too far away from a root. However, when it does converge, it is faster than the bisection method, and its convergence is usually quadratic. Newton's method is also important because it readily generalizes to higher-dimensional problems. Newton-like methods with higher orders of convergence are the Householder's methods. The first one after Newton's method is Halley's method, with cubic order of convergence.
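A minimal Newton iteration sketch (the function, its derivative and the starting point are illustrative assumptions):

  def newton(f, df, x0, tol=1e-12, max_iter=50):
      """Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n); usually quadratic convergence."""
      x = x0
      for _ in range(max_iter):
          step = f(x) / df(x)
          x -= step
          if abs(step) < tol:
              return x
      raise RuntimeError("no convergence; try a starting point closer to a root")

  # example: root of x^2 - 2 starting from x0 = 1
  print(newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0))  # ~1.4142135623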

Secant Method

Replacing the derivative in Newton's method with a finite difference, we get the secant method. This method does not require the computation (nor the existence) of a derivative, but the price is slower convergence (the order is approximately 1.6). A generalization of the secant method in higher dimensions is Broyden's method.
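A sketch of the secant iteration (starting points and test function are illustrative):

  def secant(f, x0, x1, tol=1e-12, max_iter=50):
      """Secant method: Newton's method with f'(x) replaced by the finite
      difference through the last two iterates."""
      for _ in range(max_iter):
          f0, f1 = f(x0), f(x1)
          x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
          if abs(x2 - x1) < tol:
              return x2
          x0, x1 = x1, x2
      raise RuntimeError("no convergence")

  print(secant(lambda x: x * x - 2, 1.0, 2.0))  # ~1.4142135623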

False Position (Regula Falsi)

The false position method, also called the regula falsi method, is like the secant method. However, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method is faster than the bisection method and more robust than the secant method, but requires the two starting points to bracket the root. Ridders' method is a variant on the false-position method that also evaluates the function at the midpoint of the interval, giving faster convergence with similar robustness.
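A sketch showing how regula falsi differs from the secant method: the same interpolation step, but the bracket is always preserved (function and bracket below are illustrative):

  def false_position(f, a, b, tol=1e-12, max_iter=200):
      """Regula falsi: secant-style steps that always keep the root bracketed in [a, b]."""
      fa, fb = f(a), f(b)
      if fa * fb > 0:
          raise ValueError("f(a) and f(b) must have opposite signs")
      for _ in range(max_iter):
          c = b - fb * (b - a) / (fb - fa)   # secant line's intersection with the x-axis
          fc = f(c)
          if abs(fc) < tol:
              return c
          if fa * fc < 0:
              b, fb = c, fc                  # root in [a, c]
          else:
              a, fa = c, fc                  # root in [c, b]
      return c

  print(false_position(lambda x: x * x - 2, 1.0, 2.0))  # ~1.4142135623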

Interpolation

The secant method also arises if one approximates the unknown function f by linear interpolation. When quadratic interpolation is used instead, one arrives at Müller's method. It converges faster than the secant method. A particular feature of this method is that the iterates xₙ may become complex.
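A sketch of Müller's method following the standard quadratic-fit formulation (the three starting points and the stopping rule are illustrative assumptions):

  import cmath

  def muller(f, x0, x1, x2, tol=1e-12, max_iter=50):
      """Muller's method: fit a quadratic through the last three iterates
      and step to its root nearest the newest iterate."""
      for _ in range(max_iter):
          f0, f1, f2 = f(x0), f(x1), f(x2)
          h1, h2 = x1 - x0, x2 - x1
          d1, d2 = (f1 - f0) / h1, (f2 - f1) / h2
          a = (d2 - d1) / (h2 + h1)
          b = a * h2 + d2
          c = f2
          disc = cmath.sqrt(b * b - 4 * a * c)   # this is where iterates may become complex
          denom = b + disc if abs(b + disc) > abs(b - disc) else b - disc
          x3 = x2 - 2 * c / denom
          if abs(x3 - x2) < tol:
              return x3
          x0, x1, x2 = x1, x2, x3
      return x2

  print(muller(lambda x: x * x - 2, 0.0, 1.0, 2.0))  # ~1.4142135623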

Inverse Interpolation

This can be avoided by interpolating the inverse of f, resulting in the inverse quadratic interpolation method. Again, convergence is asymptotically faster than the secant method, but inverse quadratic interpolation often behaves poorly when the iterates are not close to the root.
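One step of inverse quadratic interpolation fits a quadratic in y through the three points (f(xᵢ), xᵢ) and evaluates it at y = 0; a minimal sketch in Lagrange form:

  def iqi_step(f, x0, x1, x2):
      """One inverse-quadratic-interpolation step: interpolate x as a
      function of y through three points, then evaluate at y = 0."""
      f0, f1, f2 = f(x0), f(x1), f(x2)
      return (x0 * f1 * f2 / ((f0 - f1) * (f0 - f2))
              + x1 * f0 * f2 / ((f1 - f0) * (f1 - f2))
              + x2 * f0 * f1 / ((f2 - f0) * (f2 - f1)))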

Brent's Method

Brent's method is a combination of the bisection method, the secant method and inverse quadratic interpolation. At every iteration, Brent's method decides which method out of these three is likely to do best, and proceeds by doing a step according to that method. This gives a robust and fast method, which therefore enjoys considerable popularity.
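In practice one rarely writes Brent's method by hand; SciPy, for example, ships an implementation (the function and bracket below are illustrative):

  from scipy.optimize import brentq

  # root of x^3 - 2x - 5, bracketed in [2, 3]
  root = brentq(lambda x: x**3 - 2*x - 5, 2.0, 3.0)
  print(root)  # ~2.0945514815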

Finding roots of polynomials

Much attention has been given to the special case that the function f is a polynomial; there exist root-finding algorithms exploiting the polynomial nature of f. For a univariate polynomial of degree less than five, there are closed-form solutions such as the quadratic formula. However, even this degree-two solution should be used with care to ensure numerical stability, and the degree-four solution is unwieldy and troublesome. Higher-degree polynomials have no such general solution in radicals, by the Abel–Ruffini theorem (Ruffini 1799, Abel 1824).

Birge-Vieta's method combines Horner's method of polynomial evaluation with Newton-Raphson to provide a computational speed-up.
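The speed-up comes from the fact that Horner's scheme can evaluate p(x) and p′(x) in a single pass. A sketch of one such combined step (the coefficient ordering, highest degree first, is an assumption of this sketch):

  def horner_newton_step(coeffs, x):
      """One Newton step, with p(x) and p'(x) evaluated together by Horner's scheme.
      coeffs run from the leading coefficient down to the constant term."""
      p, dp = 0.0, 0.0
      for c in coeffs:
          dp = dp * x + p   # accumulates p'(x)
          p = p * x + c     # accumulates p(x)
      return x - p / dp

  # example: one step towards the root x = 1 of x^3 + x^2 - 5x + 3
  print(horner_newton_step([1, 1, -5, 3], 0.5))  # ~0.769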

For real roots, Sturm's theorem and Descartes' rule of signs provide guides to locating and separating roots. This, plus interval arithmetic combined with Newton's method, yields robust algorithms. The algorithm for isolating the roots, using Descartes' rule of signs, has been called Uspensky's algorithm by its inventors Collins and Akritas. It has been improved by Rouillier and Zimmermann. It has the same worst-case complexity as the Sturm algorithm, but is almost always much faster. It is the default algorithm of the Maple function fsolve.

One possibility is to form the companion matrix of the polynomial. Since the eigenvalues of this matrix coincide with the roots of the polynomial, one can use any eigenvalue algorithm to find the roots of the polynomial. For instance, the classical Bernoulli's method to find the root greatest in modulus, if it exists, turns out to be the power method applied to the companion matrix. The inverse power method, which finds some smallest root first, is what drives the Jenkins–Traub method and gives it its numerical stability and fast convergence even in the presence of multiple or clustered roots.
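A minimal companion-matrix sketch in Python (NumPy's numpy.roots is documented to work this way; the coefficient ordering, highest degree first, is an assumption of this sketch):

  import numpy as np

  def poly_roots(coeffs):
      """Roots of coeffs[0]*x^n + ... + coeffs[n] via companion-matrix eigenvalues."""
      c = np.array(coeffs, dtype=float)
      c = c / c[0]                  # make the polynomial monic
      n = len(c) - 1
      C = np.zeros((n, n))
      C[1:, :-1] = np.eye(n - 1)    # ones on the subdiagonal
      C[:, -1] = -c[:0:-1]          # last column from the negated coefficients
      return np.linalg.eigvals(C)

  print(poly_roots([1, 0, -2]))    # roots of x^2 - 2: ~[-1.414, 1.414]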

Laguerre's method, as well as Halley's method, uses second-order derivatives and complex arithmetic, including the complex square root, to exhibit cubic convergence for simple roots, dominating the quadratic convergence displayed by Newton's method.

Bairstow's method uses Newton's method to find quadratic factors of a polynomial with real coefficients. It can determine both real and complex roots of a real polynomial using only real arithmetic.

The simple Durand–Kerner method and the slightly more complicated Aberth method simultaneously find all the roots using only simple complex number arithmetic.
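A minimal Durand–Kerner sketch (the starting points and tolerance are conventional choices, not mandated by the method; coefficients run from the leading term down):

  def durand_kerner(coeffs, tol=1e-12, max_iter=200):
      """Simultaneously approximate all roots of a univariate polynomial."""
      c = [a / coeffs[0] for a in coeffs]            # make the polynomial monic
      n = len(c) - 1
      def p(x):
          v = 0
          for a in c:
              v = v * x + a                          # Horner evaluation
          return v
      roots = [(0.4 + 0.9j) ** k for k in range(n)]  # distinct non-real starting points
      for _ in range(max_iter):
          new = []
          for i, r in enumerate(roots):
              d = 1
              for j, s in enumerate(roots):
                  if i != j:
                      d *= r - s
              new.append(r - p(r) / d)               # Weierstrass correction
          if max(abs(a - b) for a, b in zip(new, roots)) < tol:
              return new
          roots = new
      return roots

  print(durand_kerner([1, -3, 2]))  # roots of x^2 - 3x + 2: ~[1, 2]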

The splitting circle method is useful for finding the roots of polynomials of high degree to arbitrary precision; it has almost optimal complexity in this setting. Another method in this style is the Dandelin–Gräffe method (actually due to Lobachevsky), which factors the polynomial.

Wilkinson's polynomial illustrates that high precision may be necessary when computing the roots of a polynomial given its coefficients: the problem of finding the roots from the coefficients is in general ill-conditioned.

Finding multiple roots of polynomials

If p(x) is a polynomial with a multiple root at r, then finding the value of r can be difficult (inefficient or impossible) for many of the standard root-finding algorithms. Fortunately, there is a technique especially for this case, provided that p is given explicitly as a polynomial in one variable with exact coefficients.

Algorithm

  1. First, we need to determine whether p(x) has a multiple root. If p(x) has a multiple root at r, then its derivative p′(x) will also have a root at r (with multiplicity one less than in p(x)). So we calculate the greatest common divisor of the polynomials p(x) and p′(x), adjust its leading coefficient to be one, and call the result g(x). (See Sturm's theorem.) If g(x) = 1, then p(x) has no multiple roots and we can safely use those other root-finding algorithms which work best when there are no multiple roots, and then exit.
  2. Now suppose that there is a multiple root. Notice that g(x) will have a root of the same multiplicity at r that p′(x) has, and the degree of the polynomial g(x) will generally be much less than that of p(x). Recursively call this routine, i.e. go back to step 1 above, using g(x) in place of p(x). Now suppose that we have found the roots of g(x), i.e. we have factored it.
  3. Since r has been found, we can factor (x − r) out of p(x) repeatedly until we have removed all of the roots at r. Repeat this for any other multiple roots until there are no more multiple roots. Then the quotient, i.e. the remaining part of p(x), can be factored in the usual way with one of the other root-finding algorithms. Exit.

Example

Suppose p(x) = x³ + x² − 5x + 3
is the function whose roots we want to find.
We calculate p′(x) = 3x² + 2x − 5.
Now divide p′(x) into p(x) to get p(x) = p′(x)·((1/3)x + (1/9)) + ((−32/9)x + (32/9)).
Divide the remainder by −32/9 to get x − 1, which is monic.
Divide x − 1 into p′(x) to get p′(x) = (x − 1)·(3x + 5) + 0.
Since the remainder is zero, g(x) = x − 1. So the multiple root of p(x) is r = 1.
Dividing p(x) by (x − 1)², we get p(x) = (x + 3)(x − 1)², so the other root is −3, a single root.
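The same computation can be checked in exact arithmetic, for instance with SymPy (a sketch; gcd, div and Poly are standard SymPy calls):

  from sympy import symbols, Poly, gcd, div

  x = symbols('x')
  p = Poly(x**3 + x**2 - 5*x + 3, x)
  g = gcd(p, p.diff(x))    # greatest common divisor of p and p'
  print(g)                 # x - 1, so the multiple root is r = 1
  q, rem = div(p, g**2)    # factor (x - 1)^2 out of p
  print(q, rem)            # x + 3, remainder 0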

Direct algorithm for multiple root elimination

There is a direct method of eliminating multiple (or repeated) roots from polynomials with exact coefficients (integers, rational numbers, Gaussian integers or rational complex numbers).

Suppose a is a root of polynomial P, with multiplicity m > 1.
Then a will be a root of the formal derivative P′, with multiplicity m − 1.
However, P′ may have additional roots that are not roots of P.
For example, if P(x) = (x − 1)³(x − 3)³, then P′(x) = 6(x − 1)²(x − 2)(x − 3)². So 2 is a root of P′, but not of P.

Define G to be the greatest common divisor of P and P′. (Say, G(x) = (x − 1)²(x − 3)².)

Finally, G divides P exactly, so form the quotient Q = P/G. (Say, Q(x) = (x − 1)(x − 3).) The roots of Q are the roots of P, with multiple roots reduced down to single roots.

As P is a polynomial with exact coefficients, Q will also be a polynomial with exact coefficients, provided the algorithm is executed using exact arithmetic.
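Carried out in exact arithmetic, the whole procedure is only a few lines; a SymPy sketch with the running example:

  from sympy import symbols, Poly, gcd, div, expand

  x = symbols('x')
  P = Poly(expand((x - 1)**3 * (x - 3)**3), x)
  G = gcd(P, P.diff(x))    # (x - 1)^2 * (x - 3)^2
  Q, rem = div(P, G)       # exact division, rem == 0
  print(Q.factor_list())   # (x - 1)*(x - 3): every root of P, now simple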

Obviously degree(Q) < degree(P). If degree(Q) ≤ 4, then the roots of Q (and hence the multiple roots of P) may be found algebraically. It is then possible to determine the multiplicities of those roots in P algebraically.

Iterative root-finding algorithms perform well in the absence of multiple roots.
