To install click the Add extension button. That's it.

The source code for the WIKI 2 extension is being checked by specialists of the Mozilla Foundation, Google, and Apple. You could also do it yourself at any point in time.

4,5
Kelly Slayton
Congratulations on this excellent venture… what a great idea!
Alexander Grigorievskiy
I use WIKI 2 every day and almost forgot how the original Wikipedia looks like.
Live Statistics
English Articles
Improved in 24 Hours
Added in 24 Hours
Languages
Recent
Show all languages
What we do. Every page goes through several hundred of perfecting techniques; in live mode. Quite the same Wikipedia. Just better.
.
Leo
Newton
Brights
Milds

Wolfe conditions

From Wikipedia, the free encyclopedia

In the unconstrained minimization problem, the Wolfe conditions are a set of inequalities for performing inexact line search, especially in quasi-Newton methods, first published by Philip Wolfe in 1969.[1][2]

In these methods the idea is to find

for some smooth . Each step often involves approximately solving the subproblem
where is the current best guess, is a search direction, and is the step length.

The inexact line searches provide an efficient way of computing an acceptable step length that reduces the objective function 'sufficiently', rather than minimizing the objective function over exactly. A line search algorithm can use Wolfe conditions as a requirement for any guessed , before finding a new search direction .

YouTube Encyclopedic

  • 1/5
    Views:
    10 708
    590
    6 256
    4 839
    3 363
  • Descent methods and line search: first Wolfe condition
  • Lecture 19B: Wolfe conditions
  • Descent methods and line search: second Wolfe condition
  • Descent methods and line search: validity of the Wolfe conditions
  • 5.12 Frank Wolfe

Transcription

Armijo rule and curvature

A step length is said to satisfy the Wolfe conditions, restricted to the direction , if the following two inequalities hold:

with . (In examining condition (ii), recall that to ensure that is a descent direction, we have , as in the case of gradient descent, where , or Newton–Raphson, where with positive definite.)

is usually chosen to be quite small while is much larger; Nocedal and Wright give example values of and for Newton or quasi-Newton methods and for the nonlinear conjugate gradient method.[3] Inequality i) is known as the Armijo rule[4] and ii) as the curvature condition; i) ensures that the step length decreases 'sufficiently', and ii) ensures that the slope has been reduced sufficiently. Conditions i) and ii) can be interpreted as respectively providing an upper and lower bound on the admissible step length values.

Strong Wolfe condition on curvature

Denote a univariate function restricted to the direction as . The Wolfe conditions can result in a value for the step length that is not close to a minimizer of . If we modify the curvature condition to the following,

then i) and iii) together form the so-called strong Wolfe conditions, and force to lie close to a critical point of .

Rationale

The principal reason for imposing the Wolfe conditions in an optimization algorithm where is to ensure convergence of the gradient to zero. In particular, if the cosine of the angle between and the gradient,

is bounded away from zero and the i) and ii) conditions hold, then .

An additional motivation, in the case of a quasi-Newton method, is that if , where the matrix is updated by the BFGS or DFP formula, then if is positive definite ii) implies is also positive definite.

Comments

Wolfe's conditions are more complicated than Armijo's condition, and a gradient descent algorithm based on Armijo's condition has a better theoretical guarantee than one based on Wolfe conditions (see the sections on "Upper bound for learning rates" and "Theoretical guarantee" in the Backtracking line search article).

See also

References

  1. ^ Wolfe, P. (1969). "Convergence Conditions for Ascent Methods". SIAM Review. 11 (2): 226–235. doi:10.1137/1011036. JSTOR 2028111.
  2. ^ Wolfe, P. (1971). "Convergence Conditions for Ascent Methods. II: Some Corrections". SIAM Review. 13 (2): 185–188. doi:10.1137/1013035. JSTOR 2028821.
  3. ^ Nocedal, Jorge; Wright, Stephen (1999). Numerical Optimization. p. 38.
  4. ^ Armijo, Larry (1966). "Minimization of functions having Lipschitz continuous first partial derivatives". Pacific J. Math. 16 (1): 1–3. doi:10.2140/pjm.1966.16.1.

Further reading

This page was last edited on 26 October 2023, at 15:42
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.