Fermat's theorem (stationary points)

In mathematics, Fermat's theorem (not to be confused with Fermat's Last Theorem) is a method for finding local maxima and minima of differentiable functions on open sets: it shows that every local extremum of the function is a stationary point (a point at which the function's derivative is zero). Fermat's theorem is a theorem in real analysis, named after Pierre de Fermat.

By using Fermat's theorem, the potential extrema of a function $f$, with derivative $f'$, are found by solving an equation in $f'$. Fermat's theorem gives only a necessary condition for extreme function values: some stationary points are inflection points (neither a maximum nor a minimum). The function's second derivative, if it exists, can often determine whether a stationary point is a maximum, a minimum, or an inflection point.

Fermat's theorem

Let $f\colon (a, b) \to \mathbb{R}$ be a function and suppose that $x_0 \in (a, b)$ is a local extremum of $f$. If $f$ is differentiable at $x_0$, then $f'(x_0) = 0$.

Another way to understand the theorem is via the contrapositive statement:

If $f$ is differentiable at $x_0$, and $f'(x_0) \neq 0$, then $x_0$ is not an extremum of $f$.

Exactly the same statement is true in higher dimensions, with the proof requiring only slight generalization.

Application to optimization

As a corollary, global extrema of a function $f$ on a domain $A$ occur only at boundary points, non-differentiable points, and stationary points. If $x_0$ is a global extremum of $f$, then one of the following is true:

boundary: $x_0$ is in the boundary of $A$
non-differentiable: $f$ is not differentiable at $x_0$
stationary point: $x_0$ is a stationary point of $f$
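
The corollary can be turned into a small search procedure. The following is a minimal sketch, assuming a continuous function on a closed bounded interval (so that global extrema exist) and assuming the caller already knows the stationary and non-differentiable points. The helper name `global_extrema` and the two example functions are illustrative choices, not part of the theorem.

```python
def global_extrema(f, a, b, stationary=(), non_differentiable=()):
    """Return ((x_min, f_min), (x_max, f_max)) for f on the closed interval [a, b].

    The candidate set consists of the three cases of the corollary:
    boundary points, non-differentiable points, and stationary points.
    The last two lists are supplied by the caller (hypothetical inputs).
    """
    candidates = [a, b]
    candidates += [x for x in non_differentiable if a < x < b]
    candidates += [x for x in stationary if a < x < b]
    values = [(x, f(x)) for x in candidates]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest


# f(x) = x^3 - 3x has f'(x) = 3x^2 - 3, so its stationary points are x = -1 and x = 1.
cubic_min, cubic_max = global_extrema(
    lambda x: x**3 - 3 * x, -1.5, 3.0, stationary=[-1.0, 1.0]
)

# |x| is not differentiable at 0 and has no stationary points.
abs_min, abs_max = global_extrema(abs, -1.0, 2.0, non_differentiable=[0.0])

print(cubic_min, cubic_max)  # minimum at a stationary point, maximum at the boundary
print(abs_min, abs_max)      # minimum at the non-differentiable point
```

In the first example the minimum falls on a stationary point and the maximum on the boundary; in the second the minimum falls on the non-differentiable point, so all three cases of the corollary appear.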

Intuition

Intuitively, a differentiable function is approximated by its derivative: near $x_0$, a differentiable function behaves infinitesimally like a linear function, or more precisely, like $f(x_0) + f'(x_0)(x - x_0)$.

Thus, from the perspective that "if $f$ is differentiable and has non-vanishing derivative at $x_0$, then it does not attain an extremum at $x_0$," the intuition is that if the derivative at $x_0$ is positive, the function is increasing near $x_0$, while if the derivative is negative, the function is decreasing near $x_0$. In both cases, it cannot attain a maximum or minimum, because its value is changing. It can only attain a maximum or minimum if it "stops": if the derivative vanishes (or if it is not differentiable, or if one runs into the boundary and cannot continue). However, making "behaves like a linear function" precise requires careful analytic proof.

More precisely, the intuition can be stated as: if the derivative is positive, there is some point to the right of $x_0$ where $f$ is greater, and some point to the left of $x_0$ where $f$ is less, and thus $f$ attains neither a maximum nor a minimum at $x_0$. Likewise, if the derivative is negative, there is a point to the right where $f$ is less, and a point to the left where $f$ is greater. Stated this way, the proof is just a matter of translating this into equations and verifying "how much greater or less".

The intuition is based on the behavior of polynomial functions. Assume that the function $f$ has a maximum at $x_0$; the reasoning is similar for a minimum. If $x_0$ is a local maximum then, roughly, there is a (possibly small) neighborhood of $x_0$ on which the function "is increasing before" and "decreasing after" $x_0$. As the derivative is positive for an increasing function and negative for a decreasing function, $f'$ is positive before and negative after $x_0$. The derivative $f'$ does not skip values (by Darboux's theorem), so it has to be zero at some point between the positive and negative values. The only point in the neighbourhood where it is possible to have $f'(x) = 0$ is $x_0$.

Note: this intuition holds at best for continuously differentiable ($C^1$) functions; in general it is not literally correct. A function need not be increasing up to a local maximum: it may instead be oscillating, so that it is neither increasing nor decreasing, and the local maximum is simply greater than any value in a small neighborhood to its left or right. See the section on pathological functions for details.

The theorem (and its proof below) is more general than the intuition in that it does not require the function to be differentiable over a neighbourhood around $x_0$. It is sufficient for the function to be differentiable at the extreme point alone.

Proof 1: Non-vanishing derivative implies not an extremum

Suppose that $f$ is differentiable at $x_0$ with derivative $K$, and assume without loss of generality that $K > 0$, so the tangent line at $x_0$ has positive slope (is increasing). Then there is a neighborhood of $x_0$ on which the secant lines through $x_0$ all have positive slope, and thus to the right of $x_0$, $f$ is greater, and to the left of $x_0$, $f$ is lesser.

The schematic of the proof is:

an infinitesimal statement about the derivative (tangent line) at $x_0$ implies
a local statement about difference quotients (secant lines) near $x_0$, which implies
a local statement about the value of $f$ near $x_0$

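Formally, by the definition of derivative, $f'(x_0) = K$ means that

$$\lim_{\varepsilon \to 0} \frac{f(x_0 + \varepsilon) - f(x_0)}{\varepsilon} = K.$$

In particular, for sufficiently small nonzero $\varepsilon$ (with $|\varepsilon|$ less than some $\varepsilon_0 > 0$), the fraction must be at least $K/2$, by the definition of limit. Thus on the interval $(x_0 - \varepsilon_0, x_0 + \varepsilon_0)$, that is, for $0 < |\varepsilon| < \varepsilon_0$, one has:

$$\frac{f(x_0 + \varepsilon) - f(x_0)}{\varepsilon} \ge \frac{K}{2};$$

one has replaced the equality in the limit (an infinitesimal statement) with an inequality on a neighborhood (a local statement). Thus, rearranging the inequality, if $\varepsilon > 0$ then:

$$f(x_0 + \varepsilon) \ge f(x_0) + \frac{K}{2}\,\varepsilon > f(x_0),$$

so on the interval to the right, $f$ is greater than $f(x_0)$, and if $\varepsilon < 0$ then:

$$f(x_0 + \varepsilon) \le f(x_0) + \frac{K}{2}\,\varepsilon < f(x_0),$$

so on the interval to the left, $f$ is less than $f(x_0)$.

Thus $x_0$ is not a local or global maximum or minimum of $f$.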
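
The three steps of this proof can be checked numerically on a concrete case. The sketch below is illustrative only: the choice of $\sin$ at $x_0 = 0.5$, the radius `eps0 = 0.1`, and the helper name `difference_quotient` are assumptions made for the example, and a finite sample of offsets does not prove the inequality.

```python
import math


def difference_quotient(f, x0, eps):
    """Slope of the secant line through (x0, f(x0)) and (x0 + eps, f(x0 + eps))."""
    return (f(x0 + eps) - f(x0)) / eps


f = math.sin
x0 = 0.5
K = math.cos(x0)   # exact derivative of sin at x0; positive here
eps0 = 0.1         # assumed radius, small enough for this particular f and x0

# Sample nonzero offsets on both sides of x0, all inside (-eps0, eps0).
offsets = [sign * eps0 * k / 20 for k in range(1, 20) for sign in (-1, 1)]

# Local statement about secant slopes: every sampled slope is at least K/2.
secants_bounded = all(difference_quotient(f, x0, e) >= K / 2 for e in offsets)

# Local statement about values: f is greater to the right and less to the left.
greater_on_right = all(f(x0 + e) > f(x0) for e in offsets if e > 0)
less_on_left = all(f(x0 + e) < f(x0) for e in offsets if e < 0)

print(secants_bounded, greater_on_right, less_on_left)
```

All three checks hold for the sampled offsets, which matches the chain "derivative at $x_0$, then secant slopes near $x_0$, then values near $x_0$".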

Proof 2: Extremum implies derivative vanishes

Alternatively, one can start by assuming that $x_0$ is a local maximum, and then prove that the derivative is 0.

Suppose that $x_0$ is a local maximum (a similar proof applies if $x_0$ is a local minimum). Then there exists $\delta > 0$ such that the interval $(x_0 - \delta, x_0 + \delta)$ is contained in the domain of $f$ and such that we have $f(x_0) \ge f(x)$ for all $x$ with $|x - x_0| < \delta$. Hence for any $h \in (0, \delta)$ it holds that

$$\frac{f(x_0 + h) - f(x_0)}{h} \le 0.$$

Since the limit of this ratio as $h$ gets close to 0 from above exists and is equal to $f'(x_0)$, we conclude that $f'(x_0) \le 0$. On the other hand, for $h \in (-\delta, 0)$ we notice that

$$\frac{f(x_0 + h) - f(x_0)}{h} \ge 0,$$

but again the limit as $h$ gets close to 0 from below exists and is equal to $f'(x_0)$, so we also have $f'(x_0) \ge 0$.

Hence we conclude that $f'(x_0) = 0$.
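
The sign argument in this proof can be observed on a simple example. The following sketch assumes the illustrative function $f(x) = 3 - (x - 1)^2$, which has a local maximum at $x_0 = 1$; the function and the step sizes are choices made for the example only.

```python
def f(x):
    # Illustrative function with a local (and global) maximum at x0 = 1.
    return 3.0 - (x - 1.0) ** 2


x0 = 1.0
steps = [10.0 ** (-k) for k in range(1, 8)]

# Difference quotients with h > 0 (from above) and with h < 0 (from below).
from_above = [(f(x0 + h) - f(x0)) / h for h in steps]
from_below = [(f(x0 - h) - f(x0)) / (-h) for h in steps]

right_nonpositive = all(q <= 0 for q in from_above)
left_nonnegative = all(q >= 0 for q in from_below)

print(from_above[-1], from_below[-1])  # both are close to 0
```

The quotients from above are never positive, those from below are never negative, and both shrink toward 0, so the only value the common limit can take is 0.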

Higher dimensions

Exactly the same statement holds; however, the proof is slightly more complicated. The complication is that in 1 dimension, one can either move left or right from a point, while in higher dimensions, one can move in many directions. Thus, if the derivative does not vanish, one must argue that there is some direction in which the function increases – and thus in the opposite direction the function decreases. This is the only change to the proof or the analysis.
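
The directional argument can be illustrated in two dimensions. The sketch below assumes the example function $f(x, y) = x^2 + 3y$ and takes the gradient as the direction of increase; the function, the point, and the step size `t` are all illustrative assumptions.

```python
def f(x, y):
    # Illustrative function of two variables.
    return x**2 + 3.0 * y


def gradient(x, y):
    # Gradient of f computed by hand: (df/dx, df/dy) = (2x, 3).
    return (2.0 * x, 3.0)


p = (1.0, 2.0)
g = gradient(*p)   # (2.0, 3.0), nonzero, so p is not a stationary point
t = 0.01           # assumed small step size

forward = f(p[0] + t * g[0], p[1] + t * g[1])    # step along the gradient
backward = f(p[0] - t * g[0], p[1] - t * g[1])   # step against the gradient
here = f(*p)

print(backward, here, forward)
```

Since the function is larger in one direction and smaller in the opposite one, the point cannot be a local extremum, which is the one-dimensional "right and left" argument applied along a line through the point.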

Applications

Fermat's theorem is central to the calculus method of determining maxima and minima: in one dimension, one can find extrema by simply computing the stationary points (by computing the zeros of the derivative), the non-differentiable points, and the boundary points, and then investigating this set to determine the extrema.

One can do this either by evaluating the function at each point and comparing the values, or by analyzing the derivatives further, using the first derivative test, the second derivative test, or the higher-order derivative test.

In dimensions above 1, one can no longer use the first derivative test, but the second derivative test and the higher-order derivative test generalize.
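
As an illustration of the one-dimensional second derivative test mentioned above, here is a minimal sketch. It assumes the second derivative is known in closed form and is passed in by the caller; the function name `second_derivative_test` and the returned labels are hypothetical.

```python
def second_derivative_test(second_derivative, x0):
    """Classify a stationary point x0 from the sign of f''(x0)."""
    value = second_derivative(x0)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"


# f(x) = x^3 - 3x: f'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1, and f''(x) = 6x.
at_minus_one = second_derivative_test(lambda x: 6 * x, -1.0)
at_plus_one = second_derivative_test(lambda x: 6 * x, 1.0)

# f(x) = x^3: f'(0) = 0 and f''(0) = 0, so the test gives no information.
cubic_at_zero = second_derivative_test(lambda x: 6 * x, 0.0)

print(at_minus_one, at_plus_one, cubic_at_zero)
```

The last case shows why a vanishing second derivative calls for a further test, such as the first derivative test or a higher-order derivative test.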

Cautions

A subtle misconception that is often held in the context of Fermat's theorem is to assume that it makes a stronger statement about local behavior than it does. Notably, Fermat's theorem does not say that functions (monotonically) "increase up to" or "decrease down from" a local maximum. This is very similar to the misconception that a limit means "monotonically getting closer to a point".

For "well-behaved functions" (which here means continuously differentiable functions), some intuitions hold, but in general functions may be ill-behaved, as illustrated below.

The moral is that derivatives determine infinitesimal behavior, and that continuous derivatives determine local behavior.

Continuously differentiable functions

If $f$ is continuously differentiable ($C^1$) on a neighborhood of $x_0$, then $f'(x_0) > 0$ does mean that $f$ is increasing on a neighborhood of $x_0$, as follows.

If $f'(x_0) = K > 0$ and $f \in C^1$, then by continuity of the derivative, there is a neighborhood $(x_0 - \varepsilon_0, x_0 + \varepsilon_0)$ on which $f'(x) > K/2$. Then $f$ is increasing on this interval, by the mean value theorem: the slope of any secant line is at least $K/2$, as it equals the slope of some tangent line.

However, in the general statement of Fermat's theorem, where one is only given that the derivative at $x_0$ is positive, one can only conclude that secant lines through $x_0$ will have positive slope, for secant lines between $x_0$ and near enough points.

Conversely, if the derivative of $f$ at a point is zero ($x_0$ is a stationary point), one cannot in general conclude anything about the local behavior of $f$: it may increase to one side and decrease to the other (as in $x^3$), increase to both sides (as in $x^4$), decrease to both sides (as in $-x^4$), or behave in more complicated ways, such as oscillating (as in $x^2 \sin(1/x)$, as discussed below).

One can analyze the infinitesimal behavior via the second derivative test and the higher-order derivative test, if the function is differentiable enough. If, in addition, the first non-vanishing derivative at $x_0$ is a continuous function, one can then conclude local behavior: if $f^{(k)}(x_0) \neq 0$ is the first non-vanishing derivative and $f^{(k)}$ is continuous (so $f \in C^k$), then one can treat $f$ as locally close to a polynomial of degree $k$, since near $x_0$ it behaves approximately as $f(x_0) + \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k$. If the $k$th derivative is not continuous, one cannot draw such conclusions, and the function may behave rather differently.
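
The higher-order derivative test referred to above can be sketched as a short routine. This is a simplified illustration, assuming the derivatives at $x_0$ are known exactly and supplied as a list; the function name `higher_order_test` and the returned labels are hypothetical.

```python
def higher_order_test(derivatives_at_x0):
    """Classify x0 from the list [f'(x0), f''(x0), f'''(x0), ...].

    Finds the first non-vanishing derivative f^(k)(x0):
    k even -> local minimum if positive, local maximum if negative;
    k odd  -> x0 is not a local extremum.
    """
    for k, value in enumerate(derivatives_at_x0, start=1):
        if value != 0:
            if k % 2 == 1:
                return "not an extremum"
            return "local minimum" if value > 0 else "local maximum"
    return "inconclusive"


cubic = higher_order_test([0, 0, 6])                 # x^3 at 0
quartic = higher_order_test([0, 0, 0, 24])           # x^4 at 0
negated_quartic = higher_order_test([0, 0, 0, -24])  # -x^4 at 0
sloped = higher_order_test([1])                      # f(x) = x at 0

print(cubic, quartic, negated_quartic, sloped)
```

The case `k = 1` is Fermat's theorem itself: a non-vanishing first derivative rules out an extremum. If every supplied derivative vanishes, the routine reports that it cannot decide.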

Pathological functions

Consider the function $\sin(1/x)$: it oscillates increasingly rapidly between $-1$ and $1$ as $x$ approaches 0. Consider then $f(x) = (1 + \sin(1/x))\,x^2$: this oscillates increasingly rapidly between 0 and $2x^2$ as $x$ approaches 0. If one extends this function by $f(0) := 0$, then the function is continuous and everywhere differentiable (it is differentiable at 0 with derivative 0), but has rather unexpected behavior near 0: in any neighborhood of 0 it attains 0 infinitely many times, but also equals $2x^2$ (a positive number) infinitely often.

Continuing in this vein, $g(x) = (2 + \sin(1/x))\,x^2$, extended by $g(0) := 0$, oscillates between $x^2$ and $3x^2$, and $x = 0$ is a local and global minimum, but on no neighborhood of 0 is it decreasing down to or increasing up from 0: it oscillates wildly near 0.

This pathology can be understood because, while the function is everywhere differentiable, it is not continuously differentiable: the limit of its derivative as $x \to 0$ does not exist, so the derivative is not continuous at 0. This reflects the oscillation between increasing and decreasing values as it approaches 0.
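
The oscillation of the derivative can be seen numerically. The sketch below assumes the example has the form $g(x) = (2 + \sin(1/x))\,x^2$ with $g(0) = 0$, for which $g'(x) = 2x\,(2 + \sin(1/x)) - \cos(1/x)$ when $x \neq 0$; the sample points are chosen for illustration so that $\cos(1/x)$ is approximately $+1$ or $-1$.

```python
import math


def g(x):
    # Assumed form of the example: g(x) = (2 + sin(1/x)) x^2, extended by g(0) = 0.
    return 0.0 if x == 0 else (2.0 + math.sin(1.0 / x)) * x * x


def g_prime(x):
    # Derivative for x != 0, by the product and chain rules.
    return 2.0 * x * (2.0 + math.sin(1.0 / x)) - math.cos(1.0 / x)


# Points approaching 0 where cos(1/x) is about +1, and where it is about -1.
falling = [1.0 / (2 * n * math.pi) for n in range(1, 50)]
rising = [1.0 / ((2 * n + 1) * math.pi) for n in range(1, 50)]

derivative_negative = all(g_prime(x) < 0 for x in falling)   # g' tends to -1 here
derivative_positive = all(g_prime(x) > 0 for x in rising)    # g' tends to +1 here
zero_is_minimum = all(g(x) > g(0.0) for x in falling + rising)

print(derivative_negative, derivative_positive, zero_is_minimum)
```

Arbitrarily close to 0 the derivative takes values near $-1$ and near $+1$, so it has no limit there, even though 0 is a minimum and the derivative at 0 itself is 0.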