Extrema

Intro

How bad will the hurricane season be next year? No one knows for sure, but it doesn't stop us from guessing.

Weather involves such extraordinarily complicated phenomena that predictions based on current conditions may not foresee catastrophic events until it's too late to act.

Instead, extreme value theory may be our best shot at getting a good sense of future behavior far in advance. This branch of statistics looks at data recorded from previous events to estimate the extreme values: the best- and worst-case scenarios.

The method is used for complex problems in the social and natural sciences alike, for instance to anticipate financial crashes, or the damage caused by earthquakes, before they occur.

Concept

In this chapter we will study the interesting points of a function. For most applications, the interesting points are where the function has a minimum or maximum.

These points are called extrema, and it turns out that wherever the function is differentiable in the interior of its domain, the gradient is zero at these points.

So we will use the same procedure as in single-variable calculus, where we looked for points with zero derivative. That is, we look for points where the gradient is zero.

Math

These are some important concepts in extremum analysis:

A critical point is where the gradient is zero.
A singular point is where the gradient is not defined.
Extrema occur at critical points, singular points, or boundary points.
The Hessian matrix can be used to classify different types of critical points.

Extrema, or extreme points, in multivariable calculus are similar to those in single-variable calculus, except they're even prettier. At least as long as we stay with functions of two variables, whose graphs are surfaces in $\mathbb{R}^3$. If we add more dimensions, they become really hard to visualize, so we'll stay in two variables for most of this topic.

The theory extends to higher dimensions, and in some cases the techniques we'll show do as well. Unfortunately, though, investigating extrema turns out to be a bit more complicated in several variables than in one.

Types of extrema

There are three types of points where extrema can occur. A point $\mathbf{a}$ in the domain of $f$ can be an extremum only if one of the following holds:

It satisfies $\nabla f(\mathbf{a}) = \mathbf{0}$. Then, we call it a critical point.
$\nabla f(\mathbf{a})$ doesn't exist. Then, it's a singular point.
It is a boundary point of the domain of $f$.

We will see more of all three types in the coming lecture notes. A small reminder: do not forget to check the boundary as you look for extrema. It's a classic but avoidable mistake.
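To illustrate the reminder about boundaries, here is a minimal sketch that samples a function on a closed rectangle, edges included. The function `plane(x, y) = x + y` and the helper `grid_extrema` are illustrative choices, not taken from the text. The gradient of this function is $(1, 1)$ everywhere, so it has no critical points, and both extrema land on the boundary.

```python
def grid_extrema(f, x_range, y_range, steps=100):
    """Sample f on a closed rectangle (edges included).

    Returns the smallest and largest samples as (value, (x, y)) pairs.
    This is a coarse numerical check, not a proof.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    samples = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            x = x0 + (x1 - x0) * i / steps
            y = y0 + (y1 - y0) * j / steps
            samples.append((f(x, y), (x, y)))
    return min(samples), max(samples)


def plane(x, y):
    # Hypothetical example: gradient is (1, 1), so there are no critical points.
    return x + y


lowest, highest = grid_extrema(plane, (0.0, 1.0), (0.0, 1.0))
print("min:", lowest)
print("max:", highest)
```

Both reported points are corners of the square, which a search restricted to critical points would have missed.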

Local and global max and min

As we classify critical points, we talk about local and global min and max points, as in single-variable calculus.

The function $f$ has a local maximum at a point $\mathbf{a}$ in the domain of $f$ if $f(\mathbf{x}) \leq f(\mathbf{a})$ for all points $\mathbf{x}$ in the domain that are sufficiently close to $\mathbf{a}$. If the inequality holds for all points $\mathbf{x}$ in the domain of $f$, then it's a global maximum.

[Figure: graph of a function with a global maximum at one point and a local maximum at another.]

Local and global minima are defined in the same way, just switching $\leq$ to $\geq$ in the definition.

Conditions for existence

Do all functions have extrema? No. There are a few requirements which need to be met.

Sufficient conditions for the existence of a maximum and a minimum value of a function $f$ are:

$f$ is a continuous function of $n$ variables.
The domain of $f$ is a closed and bounded set.

Then, the range of $f$ will be a bounded set as well, and there exist points in the domain at which $f$ attains its global max and min values.

These requirements are quite natural. Try to come up with counter-examples for each one of them, where the global max or min would be undefined.

A hint for the first one: what happens around $x = 0$ if, for example, $f(x) = 1/x$?

Singular points

Singular points are points where the gradient is not defined. These can be points where $f$ is discontinuous.

For discontinuous functions it may be, as we said in the last note, that the function does not have a global max or min. For example, a function that goes to infinity as we approach its point of discontinuity has no global maximum.

Singular points can also be points where $f$ is continuous but not differentiable. One example is the point at the bottom of the cone given by

$$f(x, y) = \sqrt{x^2 + y^2}$$

As we cannot take the partial derivatives there, we need to check the function value at this point separately to determine if it's a local or global max or min.
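A small numerical sketch of why the partial derivatives fail at the tip of a cone, assuming the cone is $f(x, y) = \sqrt{x^2 + y^2}$ (an assumed formula for illustration). The helper name `difference_quotient_x` is hypothetical. The difference quotient in $x$ at the origin takes different values from the right and from the left, so no single limit exists.

```python
import math


def cone(x, y):
    # Assumed cone for illustration: f(x, y) = sqrt(x^2 + y^2).
    return math.sqrt(x * x + y * y)


def difference_quotient_x(f, x, y, h):
    """Difference quotient of f in the x-direction with step h (h may be negative)."""
    return (f(x + h, y) - f(x, y)) / h


from_right = difference_quotient_x(cone, 0.0, 0.0, 1e-6)
from_left = difference_quotient_x(cone, 0.0, 0.0, -1e-6)
print("from the right:", from_right)
print("from the left: ", from_left)
```

The two one-sided quotients disagree (roughly $+1$ and $-1$), so the partial derivative with respect to $x$ is undefined at the origin. The function value there is $0$, and since the cone is nonnegative everywhere, the tip is a global minimum.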

Critical points

For a function $f(x, y)$, the critical points come in three versions: maxima, minima and saddle points. A point $(a, b)$ is a critical point if:

$$\nabla f(a, b) = \mathbf{0}$$

This means both partial derivatives must be zero at $(a, b)$.

At a critical point, all partial derivatives are zero
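The condition can be checked numerically. The following is a minimal sketch using central differences; the function `bowl`, the helper names and the tolerance are all illustrative assumptions rather than anything prescribed by the text.

```python
def numerical_gradient(f, x, y, h=1e-6):
    """Approximate (f_x, f_y) at (x, y) with central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def is_critical_point(f, x, y, tol=1e-6):
    """True if both approximate partial derivatives are within tol of zero."""
    fx, fy = numerical_gradient(f, x, y)
    return abs(fx) < tol and abs(fy) < tol


def bowl(x, y):
    # Hypothetical example with its only critical point at (1, -2).
    return (x - 1) ** 2 + (y + 2) ** 2


print(is_critical_point(bowl, 1.0, -2.0))
print(is_critical_point(bowl, 0.0, 0.0))
```

A numerical check like this only suggests that a point is critical. Solving $\nabla f = \mathbf{0}$ by hand remains the reliable way to find such points.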

At a critical point, the tangent plane is horizontal. This is true for maxima and for minima.

It also goes for saddle points. This type of critical point is somewhat more loosely defined; basically, any point that is neither a maximum nor a minimum but still has zero gradient is a saddle point.

Below, the typical saddle point is displayed on the right, together with a more dubious fellow on the left, which actually has an endless line of saddle points.

Classification 1: brute force

If we want to know where the critical points of $f$ are, we are often not satisfied until we've also determined which type of critical point we've found. After all, there is a big difference between having a mountain to climb and having a valley to stroll down into.

There are different ways of doing the classification. One is to proceed as follows: knowing that the function has a critical point at $(a, b)$, study the function difference:

$$\Delta f = f(a + h, b + k) - f(a, b)$$

As we take a small step $(h, k)$ away from the point, does the function increase or decrease?

If $\Delta f$ is positive for all sufficiently small $(h, k) \neq (0, 0)$, the function increases in all directions around the point, and we're dealing with a local minimum. If $\Delta f$ is negative, $f$ is decreasing in all directions, and the point must be a local maximum. If the function is decreasing in some directions and increasing in others, we're standing face to face with a saddle point.

This method is brute force, and if the function is not nice, it may be hard to conclude without doubt that what we've found isn't a saddle point.

Example

Classify the critical point in of the function

We begin by finding

since can be both negative or positive for different values of and , is a saddle point.
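The brute-force idea can be sketched numerically by stepping a small distance away from the critical point in many directions and recording the sign of the function difference. The stand-in function $f(x, y) = x^2 - y^2$ is chosen here for illustration and is not necessarily the function of the example above. The helper `classify_by_sampling`, the step radius and the number of directions are likewise assumptions.

```python
import math


def classify_by_sampling(f, a, b, radius=1e-3, directions=360):
    """Classify a critical point (a, b) from the sign of f(a+h, b+k) - f(a, b).

    Samples evenly spaced directions on a small circle around the point.
    Sampling can suggest a classification but cannot prove it.
    """
    base = f(a, b)
    signs = set()
    for i in range(directions):
        angle = 2 * math.pi * i / directions
        h = radius * math.cos(angle)
        k = radius * math.sin(angle)
        delta = f(a + h, b + k) - base
        if delta > 0:
            signs.add(1)
        elif delta < 0:
            signs.add(-1)
    if signs == {1}:
        return "local minimum"
    if signs == {-1}:
        return "local maximum"
    if signs == {1, -1}:
        return "saddle point"
    return "inconclusive"


def saddle(x, y):
    # Stand-in example: increases along the x-axis, decreases along the y-axis.
    return x ** 2 - y ** 2


print(classify_by_sampling(saddle, 0.0, 0.0))
```

As the text notes, this approach is brute force: a finite sample of directions and a single radius can miss the one direction in which the function behaves differently.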

Classification 2: second derivative test

If we are not tempted by the idea of analysing the function's behaviour in all directions around some point, our saviour is the second derivative test.

As in single-variable calculus, the signs of the second derivatives of a function tell us about its concavity. However, as there are several second derivatives, the concavity is harder to read.

In two variables, the second derivative test gives us the answer. To understand why it works, and to determine the concavity in higher dimensions, we'll need some serious linear algebra.

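Today, we'll content ourselves with stating the formula for the aforementioned second derivative test. Given all the second partial derivatives of a function $f(x, y)$, plug them into the following formula:

$$D(x, y) = f_{xx}(x, y)\, f_{yy}(x, y) - \left(f_{xy}(x, y)\right)^2$$

Now, say that $f$ has a critical point at $(a, b)$. Calculate $D(a, b)$. If

$D(a, b) < 0$, then $f$ has a saddle point at $(a, b)$.
$D(a, b) = 0$, then the test gives no information.
$D(a, b) > 0$ and $f_{xx}(a, b) > 0$, then $f$ has a local minimum at $(a, b)$.
$D(a, b) > 0$ and $f_{xx}(a, b) < 0$, then $f$ has a local maximum at $(a, b)$.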
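The four cases translate directly into a small function. This is a sketch that assumes the standard form $D = f_{xx} f_{yy} - f_{xy}^2$ and takes the three second partial derivatives, already evaluated at the critical point, as plain numbers. The function name `second_derivative_test` is an illustrative choice.

```python
def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point from its second partial derivatives.

    Assumes D = fxx * fyy - fxy**2, with all values evaluated at the point.
    """
    d = fxx * fyy - fxy ** 2
    if d < 0:
        return "saddle point"
    if d == 0:
        return "no information"
    # Here d > 0, which forces fxx to be nonzero.
    return "local minimum" if fxx > 0 else "local maximum"


# f(x, y) = x^2 + y^2 at (0, 0): fxx = 2, fyy = 2, fxy = 0
print(second_derivative_test(2, 2, 0))
# f(x, y) = x^2 - y^2 at (0, 0): fxx = 2, fyy = -2, fxy = 0
print(second_derivative_test(2, -2, 0))
# f(x, y) = x^4 + y^4 at (0, 0): all second partials vanish
print(second_derivative_test(0, 0, 0))
```

The last call shows the limits of the test: $x^4 + y^4$ clearly has a minimum at the origin, but with $D = 0$ the test cannot tell.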

Hessian matrix

The following lecture note relies heavily on concepts from linear algebra, so feel free to skip it if you are not comfortable with them. That being said, here we will build a bridge between calculus and linear algebra, which can be useful both practically and as an aid to understanding.

The Hessian matrix

Recall the second derivative test from the previous lecture note. There, we examined the sign of the following function in order to classify critical points:

$$D = f_{xx} f_{yy} - f_{xy}^2$$

It is no coincidence that we use the letter $D$ to denote this function: it stands for determinant. In particular, it is the determinant of a matrix called the Hessian matrix.

The Hessian matrix contains information about a function's concavity, and it helps us classify critical points

The determinant of a $2 \times 2$ matrix is given by:

$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

Now let us see what happens as we insert the second partial derivatives of the function $f(x, y)$ as the elements of a matrix in this way:

$$H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}$$

What we get is the Hessian matrix of a function in two variables. Assuming that all second partial derivatives are continuous, we have that

$$f_{xy} = f_{yx}$$

so that the determinant becomes

$$\det H = f_{xx} f_{yy} - f_{xy}^2$$

Hence, the function $D$ that we used to classify a critical point $(a, b)$ of a function $f$ according to the second derivative test is the determinant of the Hessian matrix at $(a, b)$.
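To make the connection concrete, here is a sketch that approximates the Hessian of a two-variable function with central differences and takes its determinant. The function `quadratic` and the helper names are hypothetical, and the step size `h` is an assumption. The mixed partial is computed once and placed in both off-diagonal entries, which relies on the symmetry $f_{xy} = f_{yx}$.

```python
def numerical_hessian(f, x, y, h=1e-3):
    """Approximate the 2x2 Hessian of f at (x, y) with central differences."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    # Symmetric by construction: the same mixed partial fills both off-diagonal slots.
    return [[fxx, fxy], [fxy, fyy]]


def determinant_2x2(m):
    """ad - bc for a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def quadratic(x, y):
    # Hypothetical example: f_xx = 2, f_yy = 2, f_xy = 3 everywhere.
    return x ** 2 + 3 * x * y + y ** 2


hessian = numerical_hessian(quadratic, 0.0, 0.0)
print(hessian)
print(determinant_2x2(hessian))
```

For this function the exact determinant is $2 \cdot 2 - 3^2 = -5$, which is negative, so the origin is a saddle point according to the second derivative test.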

The reason this technique works is that the Hessian matrix, whose elements are all the second partial derivatives, contains all the information needed for the task: the function's concavity in the various directions.

This can be useful not only for classifying critical points, but also for examining the concavity at any other point, for example when sketching a surface in 3D.

Classifying critical points in any dimension

The Hessian matrix exists for twice differentiable functions of any number of variables. For a function $f(x_1, \dots, x_n)$ it looks as follows:

$$H = \begin{pmatrix} f_{x_1 x_1} & \cdots & f_{x_1 x_n} \\ \vdots & \ddots & \vdots \\ f_{x_n x_1} & \cdots & f_{x_n x_n} \end{pmatrix}$$

The Hessian matrix can be used to classify critical points in any dimension, but it is not quite as simple as looking at the determinant. Instead, we have the following:

If the point $\mathbf{a}$ is a critical point of the function $f$, with all second partial derivatives being continuous around $\mathbf{a}$, then $\mathbf{a}$ is a:

(a) local min if $H(\mathbf{a})$ is positive definite
(b) local max if $H(\mathbf{a})$ is negative definite
(c) saddle point if $H(\mathbf{a})$ is indefinite

Recall from linear algebra that a symmetric matrix $A$ is

positive definite if all of the eigenvalues of $A$ are positive.
negative definite if all of the eigenvalues of $A$ are negative.
indefinite if there are both negative and positive eigenvalues.
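As a minimal sketch of the eigenvalue criterion, restricted to the two-variable case so that the eigenvalues have a closed form: for a symmetric matrix with rows $(a, b)$ and $(b, c)$, the eigenvalues are $\frac{a + c}{2} \pm \sqrt{\left(\frac{a - c}{2}\right)^2 + b^2}$. The function names are illustrative. The code uses the convention that an all-positive spectrum (positive definite) means a local minimum and an all-negative spectrum (negative definite) means a local maximum. A zero eigenvalue is reported as inconclusive.

```python
import math


def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    spread = math.hypot((a - c) / 2, b)
    return mean - spread, mean + spread


def classify_by_eigenvalues(a, b, c):
    """Classify a critical point whose Hessian is [[a, b], [b, c]]."""
    low, high = symmetric_2x2_eigenvalues(a, b, c)
    if low > 0:
        return "local minimum"   # positive definite
    if high < 0:
        return "local maximum"   # negative definite
    if low < 0 < high:
        return "saddle point"    # indefinite
    return "inconclusive"        # at least one zero eigenvalue


print(classify_by_eigenvalues(2, 0, 2))    # Hessian of x^2 + y^2
print(classify_by_eigenvalues(2, 3, 2))    # Hessian of x^2 + 3xy + y^2
print(classify_by_eigenvalues(-2, 0, -2))  # Hessian of -x^2 - y^2
```

In more than two variables the same sign check applies, but the eigenvalues would normally come from a numerical linear algebra routine rather than a closed-form expression.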
