Wolfe conditions
In the unconstrained minimization problem, the Wolfe conditions are a set of inequalities for performing inexact line search, especially in quasi-Newton methods.

In these methods the idea is to find

$$\min_{x} f(\mathbf{x})$$

for some smooth $f \colon \mathbb{R}^n \to \mathbb{R}$. Each step often involves approximately solving the subproblem

$$\min_{\alpha} f(\mathbf{x}_k + \alpha\,\mathbf{p}_k)$$

where $\mathbf{x}_k$ is the current best guess, $\mathbf{p}_k \in \mathbb{R}^n$ is a search direction, and $\alpha \in \mathbb{R}$ is the step length.

Then inexact line searches provide an efficient way of computing an acceptable step length $\alpha$ that reduces the objective function 'sufficiently', rather than minimizing the objective function over $\alpha \in \mathbb{R}^{+}$ exactly. A line search algorithm can use Wolfe conditions as a requirement for any guessed $\alpha$, before finding a new search direction $\mathbf{p}_k$.
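To make the role of the line search concrete, the following Python sketch (added for illustration, not part of the original article) shows the generic descent iteration: choose a direction, choose a step length with some line search routine, and update the iterate. The helper names are hypothetical.

import numpy as np

def descent_method(f, grad_f, x0, choose_direction, line_search, tol=1e-8, max_iter=1000):
    # Generic descent iteration x_{k+1} = x_k + alpha_k * p_k.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:            # stop once the gradient is (nearly) zero
            break
        p = choose_direction(x, g)             # e.g. -g for steepest descent
        alpha = line_search(f, grad_f, x, p)   # ideally a step accepted by the Wolfe tests below
        x = x + alpha * p
    return x

# Toy usage on f(x) = ||x||^2 with steepest descent and a fixed step length
# (a real line search would choose alpha adaptively; see the checks below).
f = lambda x: x @ x
grad_f = lambda x: 2 * x
print(descent_method(f, grad_f, [3.0, -4.0],
                     choose_direction=lambda x, g: -g,
                     line_search=lambda f, gf, x, p: 0.25))   # approximately [0, 0]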

Armijo rule and curvature

Denote a univariate function $\varphi$ restricted to the direction $\mathbf{p}_k$ as $\varphi(\alpha) = f(\mathbf{x}_k + \alpha\,\mathbf{p}_k)$. A step length $\alpha_k$ is said to satisfy the Wolfe conditions if the following two inequalities hold:

i) $f(\mathbf{x}_k + \alpha_k\,\mathbf{p}_k) \le f(\mathbf{x}_k) + c_1 \alpha_k\, \mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k)$,
ii) $\mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k + \alpha_k\,\mathbf{p}_k) \ge c_2\, \mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k)$,

with $0 < c_1 < c_2 < 1$. (In examining condition (ii), recall that to ensure that $\mathbf{p}_k$ is a descent direction, we have $\mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k) < 0$.)

$c_1$ is usually chosen to be quite small while $c_2$ is much larger; Nocedal gives example values of $c_1 = 10^{-4}$ and $c_2 = 0.9$ for Newton or quasi-Newton methods and $c_2 = 0.1$ for the nonlinear conjugate gradient method. Inequality i) is known as the Armijo rule and ii) as the curvature condition; i) ensures that the step length $\alpha_k$ decreases $f$ 'sufficiently', and ii) ensures that the slope has been reduced sufficiently.
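Written out in code, the two tests are straightforward to check for a given trial step length. The following Python sketch is illustrative; the function and variable names are not from any particular library.

import numpy as np

def wolfe_conditions_hold(f, grad_f, x_k, p_k, alpha, c1=1e-4, c2=0.9):
    # Directional derivative phi'(0) = p_k^T grad f(x_k); negative for a descent direction.
    slope0 = p_k @ grad_f(x_k)
    # i) Armijo (sufficient decrease) condition.
    armijo = f(x_k + alpha * p_k) <= f(x_k) + c1 * alpha * slope0
    # ii) curvature condition.
    curvature = p_k @ grad_f(x_k + alpha * p_k) >= c2 * slope0
    return armijo and curvature

# Toy check on f(x) = ||x||^2 with a steepest-descent direction.
f = lambda x: x @ x
grad_f = lambda x: 2 * x
x = np.array([1.0, -2.0])
p = -grad_f(x)
print(wolfe_conditions_hold(f, grad_f, x, p, alpha=0.25))   # True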

Strong Wolfe condition on curvature

The Wolfe conditions, however, can result in a value for the step length that is not close to a minimizer of $\varphi$. If we modify the curvature condition to the following,

iia) $\big|\mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k + \alpha_k\,\mathbf{p}_k)\big| \le c_2\, \big|\mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k)\big|$,

then i) and iia) together form the so-called strong Wolfe conditions, and force $\alpha_k$ to lie close to a critical point of $\varphi$.
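In practice one rarely implements such a search from scratch; for instance, SciPy's line_search routine attempts to find a step length satisfying the strong Wolfe conditions. A minimal usage sketch follows; the objective and parameter values are illustrative.

import numpy as np
from scipy.optimize import line_search

f = lambda x: (x[0] - 1.0)**2 + 4.0 * (x[1] + 2.0)**2
grad_f = lambda x: np.array([2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)])

x_k = np.array([0.0, 0.0])
p_k = -grad_f(x_k)           # steepest-descent direction
# c1 and c2 play the roles of the constants in i) and iia).
alpha_k, *_ = line_search(f, grad_f, x_k, p_k, c1=1e-4, c2=0.9)
print(alpha_k)               # None is returned if the search fails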

The principal reason for imposing the Wolfe conditions in an optimization algorithm where $\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_k\,\mathbf{p}_k$ is to ensure convergence of the gradient to zero. In particular, if the cosine of the angle between $\mathbf{p}_k$ and the gradient,

$$\cos \theta_k = \frac{\nabla f(\mathbf{x}_k)^{\mathrm{T}}\,\mathbf{p}_k}{\|\nabla f(\mathbf{x}_k)\|\,\|\mathbf{p}_k\|},$$

is bounded away from zero and conditions i) and ii) hold, then $\nabla f(\mathbf{x}_k) \to 0$.
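The reasoning behind this claim is not spelled out above; the standard argument is Zoutendijk's theorem (stated here for context, under the usual assumptions that $f$ is bounded below and $\nabla f$ is Lipschitz continuous), which says that the Wolfe conditions imply

$$\sum_{k=0}^{\infty} \cos^2 \theta_k\, \|\nabla f(\mathbf{x}_k)\|^2 < \infty,$$

so if $\cos^2 \theta_k \ge \delta > 0$ for all $k$, the series can converge only if $\|\nabla f(\mathbf{x}_k)\| \to 0$.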

An additional motivation, in the case of a quasi-Newton method, is that if $\mathbf{p}_k = -B_k^{-1} \nabla f(\mathbf{x}_k)$, where the matrix $B_k$ is updated by the BFGS or DFP formula, then if $B_k$ is positive definite, condition ii) implies that $B_{k+1}$ is also positive definite.
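As a sketch of why this holds (added for illustration, using the standard BFGS formula): with $s_k = \mathbf{x}_{k+1} - \mathbf{x}_k = \alpha_k\,\mathbf{p}_k$ and $y_k = \nabla f(\mathbf{x}_{k+1}) - \nabla f(\mathbf{x}_k)$, condition ii) gives $s_k^{\mathrm{T}} y_k \ge \alpha_k (c_2 - 1)\, \mathbf{p}_k^{\mathrm{T}} \nabla f(\mathbf{x}_k) > 0$, and $s_k^{\mathrm{T}} y_k > 0$ is exactly the requirement under which the BFGS update preserves positive definiteness.

import numpy as np

def bfgs_update(B, s, y):
    # Standard BFGS update of the Hessian approximation B_k -> B_{k+1}.
    # It preserves positive definiteness exactly when s^T y > 0, which the
    # Wolfe curvature condition ii) guarantees for s = alpha_k * p_k.
    sy = s @ y
    if sy <= 0:
        raise ValueError("curvature condition s^T y > 0 violated")
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy

# Toy check: the updated matrix stays positive definite.
B = np.eye(2)
s = np.array([0.5, -1.0])
y = np.array([1.0, -0.5])   # here s @ y = 1.0 > 0
print(np.linalg.eigvalsh(bfgs_update(B, s, y)))   # both eigenvalues positive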