
Wolfe conditions
In the unconstrained minimization problem, the Wolfe conditions are a set of inequalities for performing inexact line search, especially in quasi-Newton methods.
In these methods the idea is to find

$$\min_x f(x)$$

for some smooth $f\colon \mathbb{R}^n \to \mathbb{R}$. Each step often involves approximately solving the subproblem

$$\min_\alpha f(x_k + \alpha p_k),$$

where $x_k$ is the current best guess, $p_k \in \mathbb{R}^n$ is a search direction, and $\alpha \in \mathbb{R}$ is the step length.

Then inexact line searches provide an efficient way of computing an acceptable step length $\alpha$ that reduces the objective function 'sufficiently', rather than minimizing the objective function over $\alpha \in \mathbb{R}^+$ exactly. A line search algorithm can use the Wolfe conditions as a requirement for any guessed $\alpha$, before finding a new search direction $p_k$.
Armijo rule and curvature

Denote a univariate function $\varphi$ restricted to the direction $p_k$ as $\varphi(\alpha) = f(x_k + \alpha p_k)$. A step length $\alpha_k$ is said to satisfy the Wolfe conditions if the following two inequalities hold:

- i) $f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k\, p_k^{\mathrm{T}} \nabla f(x_k)$,
- ii) $p_k^{\mathrm{T}} \nabla f(x_k + \alpha_k p_k) \ge c_2\, p_k^{\mathrm{T}} \nabla f(x_k)$,

with $0 < c_1 < c_2 < 1$. (In examining condition (ii), recall that to ensure that $p_k$ is a descent direction, we have $p_k^{\mathrm{T}} \nabla f(x_k) < 0$.)
$c_1$ is usually chosen to be quite small while $c_2$ is much larger; Nocedal gives example values of $c_1 = 10^{-4}$ and $c_2 = 0.9$ for Newton or quasi-Newton methods and $c_2 = 0.1$ for the nonlinear conjugate gradient method. Inequality i) is known as the Armijo rule and ii) as the curvature condition; i) ensures that the step length $\alpha_k$ decreases $f$ 'sufficiently', and ii) ensures that the slope has been reduced sufficiently.
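Both inequalities are cheap to check given the objective and its gradient. The following is a minimal Python sketch of such a check, assuming numpy arrays and caller-supplied callables `f` and `grad`; the function name and the defaults (Nocedal's example values) are illustrative rather than from any particular library:

```python
def satisfies_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Check the (weak) Wolfe conditions for a trial step length alpha.

    f    : callable, the objective f(x)
    grad : callable, the gradient of f
    x    : current iterate x_k (numpy array)
    p    : search direction p_k; for a descent direction, grad(x) @ p < 0
    """
    g0 = grad(x) @ p  # directional derivative phi'(0); negative for a descent direction
    armijo = f(x + alpha * p) <= f(x) + c1 * alpha * g0   # condition i)
    curvature = grad(x + alpha * p) @ p >= c2 * g0        # condition ii)
    return armijo and curvature
```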
Strong Wolfe condition on curvature

The Wolfe conditions, however, can result in a value for the step length that is not close to a minimizer of $\varphi$. If we modify the curvature condition to the following,

- iia) $\left| p_k^{\mathrm{T}} \nabla f(x_k + \alpha_k p_k) \right| \le c_2 \left| p_k^{\mathrm{T}} \nabla f(x_k) \right|$,

then i) and iia) together form the so-called strong Wolfe conditions, and force $\alpha_k$ to lie close to a critical point of $\varphi$.
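Relative to the sketch above, only the curvature test changes: condition iia) takes absolute values on both sides, ruling out steps where the slope is still large and positive. A variant under the same assumptions:

```python
def satisfies_strong_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Check conditions i) and iia), the strong Wolfe conditions."""
    g0 = grad(x) @ p
    armijo = f(x + alpha * p) <= f(x) + c1 * alpha * g0         # condition i)
    strong_curv = abs(grad(x + alpha * p) @ p) <= c2 * abs(g0)  # condition iia)
    return armijo and strong_curv
```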
The principal reason for imposing the Wolfe conditions in an optimization algorithm where $x_{k+1} = x_k + \alpha_k p_k$ is to ensure convergence of the gradient to zero. In particular, if the cosine of the angle between $p_k$ and the gradient,

$$\cos \theta_k = \frac{\nabla f(x_k)^{\mathrm{T}} p_k}{\|\nabla f(x_k)\|\, \|p_k\|},$$

is bounded away from zero and conditions i) and ii) hold, then $\nabla f(x_k) \to 0$.
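This convergence result can be observed numerically by driving a descent loop with a Wolfe-compliant line search. The sketch below uses SciPy's `scipy.optimize.line_search` (a strong-Wolfe search) with steepest descent, for which $\cos \theta_k = -1$ is trivially bounded away from zero; the test function, starting point, and tolerance are arbitrary illustrative choices:

```python
import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

x = np.array([-1.2, 1.0])
for k in range(10_000):
    g = rosen_der(x)
    if np.linalg.norm(g) < 1e-6:  # gradient has (numerically) converged to zero
        break
    p = -g                        # steepest descent direction
    alpha = line_search(rosen, rosen_der, x, p)[0]  # step satisfying strong Wolfe
    if alpha is None:             # the search can fail; give up in this sketch
        break
    x = x + alpha * p

print(k, x, np.linalg.norm(rosen_der(x)))
```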
An additional motivation, in the case of a quasi-Newton method, is that if $p_k = -B_k^{-1} \nabla f(x_k)$, where the matrix $B_k$ is updated by the BFGS or DFP formula, then if $B_k$ is positive definite, condition ii) implies that $B_{k+1}$ is also positive definite.
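The mechanism is the curvature product: with $s_k = x_{k+1} - x_k$ and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$, subtracting $p_k^{\mathrm{T}} \nabla f(x_k)$ from both sides of ii) gives $y_k^{\mathrm{T}} s_k > 0$, which is exactly what the BFGS update needs. A minimal sketch of the update (variable names are illustrative):

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation B_k.

    s = x_{k+1} - x_k and y = grad f(x_{k+1}) - grad f(x_k).
    Condition ii) guarantees s @ y > 0, which keeps B_{k+1}
    positive definite whenever B_k is.
    """
    assert s @ y > 0, "curvature condition violated"
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
```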