Fermat's theorem (stationary points)
In mathematics, Fermat's theorem (not to be confused with Fermat's last theorem) is a method to find local maxima and minima of differentiable functions on open sets by showing that every local extremum of the function is a stationary point (the derivative of the function is zero at that point). Fermat's theorem is a theorem in real analysis, named after Pierre de Fermat.
By using Fermat's theorem, the potential extrema of a function $f$, with derivative $f'$, are found by solving an equation in $f'$, namely $f'(x) = 0$. Fermat's theorem gives only a necessary condition for extreme function values, and some stationary points are inflection points (neither a maximum nor a minimum). The function's second derivative, if it exists, can in many cases determine whether a given stationary point is a maximum, a minimum, or an inflection point.
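As a small worked illustration (the functions $p$ and $q$ below are chosen here for the example and are not part of the theorem), solving the stationary-point equation and then checking the second derivative might look like this:

$$
\begin{aligned}
p(x) &= x^3 - 3x, & p'(x) &= 3x^2 - 3 = 0 \iff x = \pm 1, \\
p''(x) &= 6x, & p''(-1) &= -6 < 0 \ \text{(local maximum)}, \quad p''(1) = 6 > 0 \ \text{(local minimum)}, \\
q(x) &= x^3, & q'(0) &= 0, \ \text{but } q \text{ is strictly increasing, so } 0 \text{ is an inflection point, not an extremum.}
\end{aligned}
$$

The second example shows why the condition is only necessary: a zero derivative does not by itself guarantee an extremum.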
Fermat's theorem
Let $f\colon (a,b) \to \mathbb{R}$ be a function and suppose that $x_0 \in (a,b)$ is a local extremum of $f$. If $f$ is differentiable at $x_0$ then $f'(x_0) = 0$.
Another way to understand the theorem is via the contrapositive statement:
- If $f$ is differentiable at $x_0 \in (a,b)$, and
- $f'(x_0) \neq 0$,
- then $x_0$ is not a local extremum of $f$.
Exactly the same statement is true in higher dimensions, with the proof requiring only slight generalization.
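The higher-dimensional version can be illustrated numerically. The sketch below assumes a hypothetical function f(x, y) = x^2 + 3y with a hand-computed gradient (neither comes from the article) and checks that, at a point where the gradient is nonzero, a small step along the gradient increases f while a small step against it decreases f, so the point cannot be an extremum.

```python
def f(x, y):
    # Hypothetical example function; its gradient never vanishes.
    return x * x + 3.0 * y


def grad_f(x, y):
    # Gradient of f computed by hand: (df/dx, df/dy) = (2x, 3).
    return (2.0 * x, 3.0)


def step_along_gradient(point, t):
    """Return point + t * grad_f(point); negative t steps against the gradient."""
    gx, gy = grad_f(*point)
    return (point[0] + t * gx, point[1] + t * gy)


x0 = (1.0, 2.0)
for t in (1e-1, 1e-2, 1e-3):
    forward = f(*step_along_gradient(x0, t))
    backward = f(*step_along_gradient(x0, -t))
    # f is larger a small step along the gradient and smaller against it.
    print(t, backward < f(*x0) < forward)
```

For this particular f the inequality holds for every step size tried; in general it is only guaranteed for sufficiently small steps.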
Application to optimization
As a corollary, global extrema of a function $f$ on a domain $A$ occur only at boundaries, non-differentiable points, and stationary points.
If $x_0$ is a global extremum of $f$, then one of the following is true:
- boundary: $x_0$ is in the boundary of $A$
- non-differentiable: $f$ is not differentiable at $x_0$
- stationary point: $x_0$ is a stationary point of $f$
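The three cases above suggest a simple search procedure on a closed interval. The following is a minimal sketch, assuming the hypothetical function f(x) = x^3 - 3x on [-2, 3], whose stationary points were found by hand; it is meant as an illustration rather than a general-purpose optimizer.

```python
def f(x):
    # Hypothetical example: a polynomial, so differentiable everywhere.
    return x**3 - 3.0 * x


a, b = -2.0, 3.0             # the domain A = [a, b]; its boundary is {a, b}
stationary = [-1.0, 1.0]     # solutions of f'(x) = 3x^2 - 3 = 0, found by hand
non_differentiable = []      # none for a polynomial

# Candidate set: boundary points plus interior stationary / non-differentiable points.
candidates = [a, b] + [x for x in stationary + non_differentiable if a < x < b]

values = {x: f(x) for x in candidates}
global_max = max(values.values())
global_min = min(values.values())
argmax = [x for x, v in values.items() if v == global_max]
argmin = [x for x, v in values.items() if v == global_min]

print("max", global_max, "at", argmax)
print("min", global_min, "at", argmin)
```

Here the global maximum is attained on the boundary, while the global minimum is attained both at a boundary point and at an interior stationary point, so all candidate types have to be checked.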
Intuition
Intuitively, a differentiable function is approximated by its derivative: a differentiable function behaves infinitesimally like a linear function $a + bx$, or more precisely, $f(x_0) + f'(x_0)(x - x_0)$. Thus, from the perspective that "if $f$ is differentiable and has non-vanishing derivative at $x_0$, then it does not attain an extremum at $x_0$", the intuition is that if the derivative at $x_0$ is positive, the function is increasing near $x_0$, while if the derivative is negative, the function is decreasing near $x_0$. In both cases, it cannot attain a maximum or minimum, because its value is changing. It can only attain a maximum or minimum if it "stops": if the derivative vanishes (or if it is not differentiable, or if one runs into the boundary and cannot continue). However, making "behaves like a linear function" precise requires careful analytic proof.
More precisely, the intuition can be stated as: if the derivative is positive, there is some point to the right of $x_0$ where $f$ is greater, and some point to the left of $x_0$ where $f$ is less, and thus $f$ attains neither a maximum nor a minimum at $x_0$. Conversely, if the derivative is negative, there is a point to the right which is lesser, and a point to the left which is greater. Stated this way, the proof is just translating this into equations and verifying "how much greater or less".
The intuition is based on the behavior of polynomial functions. Assume that the function $f$ has a maximum at $x_0$, the reasoning being similar for a minimum. If $x_0 \in (a,b)$ is a local maximum then, roughly, there is a (possibly small) neighborhood of $x_0$ such that the function "is increasing before" and "decreasing after" $x_0$. (This intuition is only correct for continuously differentiable ($C^1$) functions, while in general it is not literally correct: a function need not be increasing up to a local maximum. It may instead be oscillating, so neither increasing nor decreasing, with the local maximum simply greater than any values in a small neighborhood to the left or right of it. See details in the pathologies.) As the derivative is positive for an increasing function and negative for a decreasing function, $f'$ is positive before and negative after $x_0$. $f'$ doesn't skip values (by Darboux's theorem), so it has to be zero at some point between the positive and negative values. The only point in the neighbourhood where it is possible to have $f'(x) = 0$ is $x_0$.
The theorem (and its proof below) is more general than the intuition in that it doesn't require the function to be differentiable over a neighbourhood around $x_0$. It is sufficient for the function to be differentiable only at the extreme point.
Proof 1: Non-vanishing derivatives implies not extremum
Suppose that $f$ is differentiable at $x_0 \in (a,b)$ with derivative $K$, and assume without loss of generality that $K > 0$, so the tangent line at $x_0$ has positive slope (is increasing). Then there is a neighborhood of $x_0$ on which the secant lines through $x_0$ all have positive slope, and thus to the right of $x_0$, $f$ is greater, and to the left of $x_0$, $f$ is lesser.
The schematic of the proof is:
- an infinitesimal statement about the derivative (tangent line) at $x_0$ implies
- a local statement about difference quotients (secant lines) near $x_0$, which implies
- a local statement about the value of $f$ near $x_0$.
Formally, by the definition of derivative, $f'(x_0) = K$ means that
$$\lim_{\varepsilon \to 0} \frac{f(x_0 + \varepsilon) - f(x_0)}{\varepsilon} = K.$$
In particular, for sufficiently small nonzero $\varepsilon$ (with $|\varepsilon|$ less than some $\varepsilon_0 > 0$), the fraction must be greater than $K/2$, by the definition of limit. Thus on the interval $(x_0 - \varepsilon_0, x_0 + \varepsilon_0)$, for $\varepsilon \neq 0$, one has:
$$\frac{f(x_0 + \varepsilon) - f(x_0)}{\varepsilon} > \frac{K}{2};$$
one has replaced the equality in the limit (an infinitesimal statement) with an inequality on a neighborhood (a local statement). Thus, rearranging the inequality, if $\varepsilon > 0$ then:
$$f(x_0 + \varepsilon) > f(x_0) + \frac{K}{2}\varepsilon > f(x_0),$$
so on the interval to the right, $f$ is greater than $f(x_0)$, and if $\varepsilon < 0$ then:
$$f(x_0 + \varepsilon) < f(x_0) + \frac{K}{2}\varepsilon < f(x_0),$$
so on the interval to the left, $f$ is less than $f(x_0)$.
Thus $x_0$ is not a local or global maximum or minimum of $f$.
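The passage from the limit to a neighborhood inequality can be checked numerically on a concrete case. The sketch below assumes the hypothetical function f(x) = x + x^2 at x0 = 0, where the derivative is K = 1 and the difference quotient equals 1 + eps, so eps0 = 0.5 works; these choices are illustrative only.

```python
def f(x):
    # Hypothetical example with f'(0) = 1 > 0.
    return x + x * x


x0 = 0.0
K = 1.0      # f'(x0) = 1 + 2 * x0
eps0 = 0.5   # for this f the quotient is 1 + eps, which exceeds K / 2 when |eps| < 0.5


def quotient(eps):
    """Slope of the secant line through x0 and x0 + eps."""
    return (f(x0 + eps) - f(x0)) / eps


for eps in (0.4, 0.1, 0.001, -0.001, -0.1, -0.4):
    assert abs(eps) < eps0
    above_half_slope = quotient(eps) > K / 2
    # To the right f is greater than f(x0); to the left it is smaller.
    correct_side = f(x0 + eps) > f(x0) if eps > 0 else f(x0 + eps) < f(x0)
    print(eps, above_half_slope, correct_side)
```

Every sampled secant slope stays above K/2, which is exactly what forces f above f(x0) on the right and below it on the left.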
Proof 2: Extremum implies derivative vanishes
Alternatively, one can start by assuming that $x_0$ is a local maximum, and then prove that the derivative is 0.
Suppose that $x_0$ is a local maximum (a similar proof applies if $x_0$ is a local minimum). Then there exists $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subset (a,b)$ and such that we have $f(x_0) \ge f(x)$ for all $x$ with $|x - x_0| < \delta$. Hence for any $h \in (0, \delta)$ it holds that
$$\frac{f(x_0 + h) - f(x_0)}{h} \le 0.$$
Since the limit of this ratio as $h$ gets close to 0 from above exists and is equal to $f'(x_0)$, we conclude that $f'(x_0) \le 0$. On the other hand, for $h \in (-\delta, 0)$ we notice that
$$\frac{f(x_0 + h) - f(x_0)}{h} \ge 0,$$
but again the limit as $h$ gets close to 0 from below exists and is equal to $f'(x_0)$, so we also have $f'(x_0) \ge 0$.
Hence we conclude that $f'(x_0) = 0$.
Higher dimensions
Exactly the same statement holds; however, the proof is slightly more complicated. The complication is that in 1 dimension, one can either move left or right from a point, while in higher dimensions, one can move in many directions. Thus, if the derivative does not vanish, one must argue that there is some direction in which the function increases, and thus in the opposite direction the function decreases. This is the only change to the proof or the analysis.
Applications
Fermat's theorem is central to the calculus method of determining maxima and minima: in one dimension, one can find extrema by simply computing the stationary points (by computing the zeros of the derivative), the non-differentiable points, and the boundary points, and then investigating this set to determine the extrema.
One can do this either by evaluating the function at each point and taking the maximum (or minimum), or by analyzing the derivatives further, using the first derivative test, the second derivative test, or the higher-order derivative test.
In dimensions above 1, one can no longer use the first derivative test, but the second derivative test and higher-order derivative test generalize.
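For polynomials in one variable, the higher-order derivative test can be sketched in a few lines. The helper names below (`derivative`, `evaluate`, `classify`) are hypothetical, and the exact comparisons with zero assume small integer coefficients and integer evaluation points, so that floating-point arithmetic stays exact.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where c_i multiplies x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def classify(coeffs, x0):
    """Classify x0 by the order and sign of the first non-vanishing derivative there."""
    d = derivative(coeffs)
    if evaluate(d, x0) != 0:
        return "not stationary"
    order = 1
    while d:
        d = derivative(d)
        order += 1
        value = evaluate(d, x0)
        if value != 0:
            if order % 2 == 1:
                return "neither"  # odd order: no extremum at x0
            return "local minimum" if value > 0 else "local maximum"
    return "constant"  # every derivative vanishes, so the polynomial is constant


print(classify([0, -3, 0, 1], -1))   # x^3 - 3x at x = -1
print(classify([0, -3, 0, 1], 1))    # x^3 - 3x at x = 1
print(classify([0, 0, 0, 1], 0))     # x^3 at x = 0
print(classify([0, 0, 0, 0, 1], 0))  # x^4 at x = 0
```

The second derivative test is the special case in which the loop stops at order 2; for x^3 and x^4 the second derivative vanishes at 0, and the later derivatives decide the outcome.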
Cautions
A subtle misconception that is often held in the context of Fermat's theorem is to assume that it makes a stronger statement about local behavior than it does. Notably, Fermat's theorem does not say that functions (monotonically) "increase up to" or "decrease down from" a local maximum. This is very similar to the misconception that a limit means "monotonically getting closer to a point".
For "well-behaved functions" (which here means continuously differentiable), some intuitions hold, but in general functions may be ill-behaved, as illustrated below.
The moral is that derivatives determine infinitesimal behavior, and that continuous derivatives determine local behavior.
Continuously differentiable functions
If $f$ is continuously differentiable ($C^1$) on a neighborhood of $x_0$, then $f'(x_0) > 0$ does mean that $f$ is increasing on a neighborhood of $x_0$, as follows.
If $f'(x_0) = K > 0$ and $f \in C^1$, then by continuity of the derivative, there is a neighborhood $(x_0 - \varepsilon_0, x_0 + \varepsilon_0)$ of $x_0$ on which $f'(x) > K/2$. Then $f$ is increasing on this interval, by the mean value theorem: the slope of any secant line is at least $K/2$, as it equals the slope of some tangent line.
However, in the general statement of Fermat's theorem, where one is only given that the derivative at $x_0$ is positive, one can only conclude that secant lines through $x_0$ will have positive slope, for secant lines between $x_0$ and near enough points.
Conversely, if the derivative of $f$ at a point is zero ($x_0$ is a stationary point), one cannot in general conclude anything about the local behavior of $f$: it may increase to one side and decrease to the other (as in $x^3$), increase to both sides (as in $x^4$), decrease to both sides (as in $-x^4$), or behave in more complicated ways, such as oscillating (as in $x^2 \sin(1/x)$, as discussed below).
One can analyze the infinitesimal behavior via the second derivative test and higher-order derivative test, if the function is differentiable enough. If the first non-vanishing derivative at $x_0$ is a continuous function, one can then conclude local behavior: if $f^{(k)}(x_0) \neq 0$ is the first non-vanishing derivative and $f^{(k)}$ is continuous, so $f \in C^k$, then one can treat $f$ as locally close to a polynomial of degree $k$, since it behaves approximately as $f(x_0) + \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k$. If the $k$th derivative is not continuous, one cannot draw such conclusions, and the function may behave rather differently.
Pathological functions
Consider the function $\sin(1/x)$: it oscillates increasingly rapidly between $-1$ and $1$ as $x$ approaches 0. Consider then $f(x) = (1 + \sin(1/x))x^2$: this oscillates increasingly rapidly between 0 and $2x^2$ as $x$ approaches 0. If one extends this function by $f(0) := 0$, then the function is continuous and everywhere differentiable (it is differentiable at 0 with derivative 0), but has rather unexpected behavior near 0: in any neighborhood of 0 it attains 0 infinitely many times, but also equals $2x^2$ (a positive number) infinitely often.
Continuing in this vein, $g(x) = (2 + \sin(1/x))x^2$, extended by $g(0) := 0$, oscillates between $x^2$ and $3x^2$, and $x = 0$ is a local and global minimum, but on no neighborhood of 0 is it decreasing down to or increasing up from 0: it oscillates wildly near 0.
This pathology can be understood because, while the function $g$ is everywhere differentiable, it is not continuously differentiable: the limit of $g'(x)$ as $x \to 0$ does not exist, so the derivative is not continuous at 0. This reflects the oscillation between increasing and decreasing values as it approaches 0.
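The oscillation of the derivative can be observed numerically. The sketch below takes the function (2 + sin(1/x)) x^2, extended by 0 at the origin, under the assumed name `g`, with a hand-computed derivative `g_prime` for x ≠ 0. It samples points where cos(1/x) is close to 1 and to -1 and shows that the derivative keeps changing sign arbitrarily close to the minimum at 0.

```python
import math


def g(x):
    # (2 + sin(1/x)) * x^2 for x != 0, extended by g(0) = 0.
    return 0.0 if x == 0 else (2.0 + math.sin(1.0 / x)) * x * x


def g_prime(x):
    # Derivative for x != 0, computed by hand with the product and chain rules.
    return 2.0 * x * (2.0 + math.sin(1.0 / x)) - math.cos(1.0 / x)


for n in (10, 100, 1000):
    x_down = 1.0 / (2.0 * math.pi * n)         # cos(1/x) near 1, so g' is about 4x - 1 < 0
    x_up = 1.0 / ((2.0 * n + 1.0) * math.pi)   # cos(1/x) near -1, so g' is about 4x + 1 > 0
    sign_change = g_prime(x_down) < 0 < g_prime(x_up)
    still_above_minimum = g(x_down) > g(0.0) and g(x_up) > g(0.0)
    print(n, sign_change, still_above_minimum)
```

Both kinds of sample points lie to the right of 0 and approach it as n grows, so g is neither increasing nor decreasing on any interval (0, ε), even though g(x) > g(0) for every x ≠ 0.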
See also
- Optimization (mathematics)
- Maxima and minima
- Derivative
- Extreme value
- arg max