Intro
Quadratic forms are used in support vector machines, a class of methods within machine learning used to distinguish one category of data from another.
They've given rise to an entire programming field: quadratic programming. These methods have been used to classify cancer types and proteins.
So although quadratic forms may seem kinda random, they've got many real-world applications!
Concept
When introduced to functions of one variable, the starting point is the equation of a line:
$$y = kx + m$$
where the parameters $k$ and $m$ determine the slope and the intercept with the $y$-axis, respectively.
Not until the concepts of linear equations have been thoroughly understood is a student ready to go on to the next step, the quadratic equation:
$$y = ax^2 + bx + c$$
The fact that one term contains the independent variable raised to the second power leads to a function whose behavior is a bit more complex.
The same is true when we move from linear systems to ones of quadratic form, where components of vectors are allowed to be multiplied by themselves or other components of the vector.
Math
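The general quadratic form in $\mathbb{R}^n$ contains all possible terms $x_i x_j$ multiplied by some factor, where $1 \le i, j \le n$. For example, in $\mathbb{R}^2$ the general quadratic form is:
$$a_1 x_1^2 + a_2 x_2^2 + 2 a_3 x_1 x_2$$
where we have combined the two terms $x_1 x_2$ and $x_2 x_1$ under the same factor $2 a_3$.
We can then express the quadratic form as a multiplication with a symmetric matrix $A$:
$$a_1 x_1^2 + a_2 x_2^2 + 2 a_3 x_1 x_2 = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} a_1 & a_3 \\ a_3 & a_2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \mathbf{x}^T A \mathbf{x}$$
We say that a function of this type, in any dimension $n$, is the quadratic form associated with $A$, denoted by $Q_A$, where:
$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$$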
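As a small illustration, the sketch below evaluates $\mathbf{x}^T A \mathbf{x}$ for a symmetric $2 \times 2$ matrix and compares it with the expanded polynomial $a_1 x_1^2 + a_2 x_2^2 + 2 a_3 x_1 x_2$. The function name `quadratic_form` and the numeric values are assumptions chosen for the example only.

```python
def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


# Illustrative coefficients: a1 and a2 sit on the diagonal, a3 off the diagonal.
a1, a2, a3 = 2, 3, -1
A = [[a1, a3],
     [a3, a2]]
x = [1, 2]

# The same value computed from the expanded polynomial.
expanded = a1 * x[0] ** 2 + a2 * x[1] ** 2 + 2 * a3 * x[0] * x[1]

print(quadratic_form(A, x), expanded)  # both values agree
```

The off-diagonal entry $a_3$ appears twice in the matrix, which is exactly why the cross product term carries the factor $2 a_3$.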
Quadratic form
Introduction
A course in linear algebra more or less exhausts expressions of the form
$$a_1 x_1 + a_2 x_2 + \dots + a_n x_n$$
in the study of linear equations and linear systems. So far we have only considered variables raised to the first power; no variable has been squared or raised to a higher power. As opposed to these linear forms, we now consider quadratic forms:
$$a_1 x_1^2 + a_2 x_2^2 + \dots + a_n x_n^2 + (\text{all possible terms } a_k x_i x_j \text{ with } i < j)$$
where $a_k x_i x_j$ are all the possible cross product terms, in which $x_i$ and $x_j$ are distinct. Consider the following examples for $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively, for a clarification of the cross product terms.
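A general quadratic form on $\mathbb{R}^2$ looks like
$$a_1 x_1^2 + a_2 x_2^2 + 2 a_3 x_1 x_2$$
while the quadratic form on $\mathbb{R}^3$ looks like
$$a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 + 2 a_4 x_1 x_2 + 2 a_5 x_1 x_3 + 2 a_6 x_2 x_3$$
Both examples can be rewritten in matrix form as
$$\begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} a_1 & a_3 \\ a_3 & a_2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \mathbf{x}^T A \mathbf{x}$$
$$\begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} a_1 & a_4 & a_5 \\ a_4 & a_2 & a_6 \\ a_5 & a_6 & a_3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \mathbf{x}^T A \mathbf{x}$$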
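Note that the matrices used above are symmetric, with their diagonal elements corresponding to the coefficients of the squared terms and their off-diagonal elements to half the coefficients of the cross product terms. We call the function
$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$$
the quadratic form associated with $A$, which can also be expressed in dot product notation as
$$\mathbf{x}^T A \mathbf{x} = \mathbf{x} \cdot A\mathbf{x} = A\mathbf{x} \cdot \mathbf{x}$$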
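To make the passage from polynomial to matrix concrete, here is a minimal sketch that builds the symmetric matrix from the coefficients of a quadratic form and evaluates it in dot product notation. The helper names (`symmetric_matrix`, `mat_vec`, `dot`), the 0-based indexing and the example polynomial are assumptions made for illustration.

```python
def symmetric_matrix(squares, cross):
    """Build the symmetric matrix of a quadratic form.

    squares[i] is the coefficient of x_i^2; cross[(i, j)] with i < j is the
    full coefficient of the cross product term x_i x_j (0-based indices).
    """
    n = len(squares)
    A = [[0.0] * n for _ in range(n)]
    for i, c in enumerate(squares):
        A[i][i] = float(c)
    for (i, j), c in cross.items():
        A[i][j] = A[j][i] = c / 2  # split the coefficient over both positions
    return A


def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


# Q(x) = x1^2 + 2 x2^2 + 3 x3^2 + 4 x1 x2 - 2 x1 x3 + 6 x2 x3
A = symmetric_matrix([1, 2, 3], {(0, 1): 4, (0, 2): -2, (1, 2): 6})
x = [1.0, -1.0, 2.0]

value = dot(x, mat_vec(A, x))      # x . (A x)
same_value = dot(mat_vec(A, x), x)  # (A x) . x
print(A, value, same_value)
```

Halving each cross product coefficient is what keeps the matrix symmetric; any other split would give the same polynomial but a non-symmetric matrix.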
Three problems
Speaking of quadratic forms, there are three important questions that one stumbles across. When we study the quadratic function
$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$$
we consider the following three questions:
Does $Q_A$ assume only positive values, only negative values, or both, given that $\mathbf{x} \neq \mathbf{0}$?
What kind of curve, or surface, are we dealing with, given that $\mathbf{x}^T A \mathbf{x} = k$ is defined in $\mathbb{R}^2$, or in $\mathbb{R}^3$?
What are the maximum and minimum values of $Q_A$ on $\mathbb{R}^n$, given that $\mathbf{x}$ is constrained to satisfy $\|\mathbf{x}\| = 1$?
The first two questions are addressed here, while the third is considered part of the mathematical branch Optimization Theory, and is therefore not dealt with in this section. We start by sharing a practical approach to question number one, followed by introducing the names of the surfaces and linking them to the properties of the matrix $A$.
Principal axes theorem
The principal axes theorem is used to answer the following question about quadratic functions:
Does the quadratic form $Q_A$ assume only positive values, only negative values, or both?
The question is cumbersome to answer for a general quadratic function because of its cross product terms. It is much more easily answered when there are no cross product terms to consider. We start simple!
Intro without cross product terms
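First, let's revisit the quadratic form in $\mathbb{R}^n$ without any cross product terms:
$$Q_D(\mathbf{x}) = a_1 x_1^2 + a_2 x_2^2 + \dots + a_n x_n^2$$
Note that we name the matrix for this quadratic form $D$, since the matrix is diagonal whenever there are no cross product terms. Like this:
$$Q_D(\mathbf{x}) = \mathbf{x}^T D \mathbf{x} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$
Since all variables are squared, every nonzero component of $\mathbf{x}$ contributes a positive factor $x_i^2$, so the sign of $Q_D$ is determined by the signs of the coefficients. We have the following three cases for the sum $a_1 x_1^2 + a_2 x_2^2 + \dots + a_n x_n^2$ with $\mathbf{x} \neq \mathbf{0}$: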
$Q_D$ assumes only positive values if, and only if, all the coefficients are positive.
$Q_D$ assumes only negative values if, and only if, all the coefficients are negative.
$Q_D$ assumes both positive and negative values if, and only if, the coefficients include both positive and negative values.
This is all well and good, but when considering the general quadratic form in $\mathbb{R}^n$,
$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$$
we encounter cross product terms, and they make it complicated to deduce what values $Q_A$ assumes. However, what if we could transform $Q_A$ into a form $Q_D$ without cross product terms? Wouldn't that be nice? The key is that the matrix $A$ is symmetric, meaning that $A$ is orthogonally diagonalizable. Perhaps this ignites a spark?
Getting rid of cross product terms
First, a recap!
Recall that all symmetric matrices are orthogonally diagonalizable. It follows that there is an orthogonal matrix $P$ of eigenvectors, and a diagonal matrix $D$ of eigenvalues, so we can rewrite $A$ as
$$A = P D P^{-1} = P D P^T$$
where $P^{-1} = P^T$ since $P$ is an orthogonal matrix.
Now, our approach is to deduce what values $Q_A$ assumes by getting rid of the cross product terms, along the following line of thought:
Do a variable substitution $\mathbf{x} = P\mathbf{y}$ so that
we can utilize the orthogonal diagonalization $P^T A P = D$
to get rid of the cross product terms.
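Easy enough, right? Let's do it! We have
$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$$
Let's do the variable substitution $\mathbf{x} = P\mathbf{y}$, where $P$ is an orthogonal matrix whose columns are eigenvectors of $A$:
$$\mathbf{x}^T A \mathbf{x} = (P\mathbf{y})^T A (P\mathbf{y}) = \mathbf{y}^T (P^T A P) \mathbf{y} = \mathbf{y}^T D \mathbf{y} = Q_D(\mathbf{y})$$
and now we've gotten rid of the cross product terms of $Q_A$.
We can now easily deduce the values of $Q_A$ by inspection of $Q_D$ (without cross product terms) and applying our aforementioned three cases for the sum in $Q_D$. This result is summarized in the following theorem: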
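Principal Axes Theorem
Let
$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$$
be a quadratic function, where $A$ is a symmetric $n \times n$ matrix. Then there exists a change of variable
$$\mathbf{x} = P\mathbf{y}$$
that transforms $Q_A$ into a quadratic function $Q_D$ that has no cross product terms. Here $P$ is an orthogonal matrix that orthogonally diagonalizes $A$. Making the variable substitution $\mathbf{x} = P\mathbf{y}$ yields the quadratic form
$$\mathbf{x}^T A \mathbf{x} = \mathbf{y}^T D \mathbf{y} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \dots + \lambda_n y_n^2$$
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$, listed in the same order as the columns of $P$, which are the corresponding eigenvectors of $A$.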
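As a numerical sanity check of the theorem, the sketch below diagonalizes a symmetric $2 \times 2$ matrix in closed form and compares $\mathbf{x}^T A \mathbf{x}$ with $\lambda_1 y_1^2 + \lambda_2 y_2^2$ after the substitution $\mathbf{x} = P\mathbf{y}$. The helper names `eig_sym_2x2` and `q_form`, the example matrix and the test vector are assumptions chosen for illustration, and the closed-form eigenvalue formula covers only the $2 \times 2$ case.

```python
import math


def eig_sym_2x2(A):
    """Eigenvalues and orthonormal eigenvectors of a symmetric 2x2 matrix.

    Returns (l1, l2, P) with l1 >= l2 and the eigenvectors as columns of P.
    """
    a, b, c = A[0][0], A[0][1], A[1][1]
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    l1, l2 = mean + radius, mean - radius
    if b == 0:
        # Already diagonal: use the standard basis, ordered to match l1 >= l2.
        P = [[1.0, 0.0], [0.0, 1.0]] if a >= c else [[0.0, 1.0], [1.0, 0.0]]
        return l1, l2, P
    # (b, l1 - a) is an eigenvector for l1; its 90-degree rotation belongs to l2.
    norm = math.hypot(b, l1 - a)
    u0, u1 = b / norm, (l1 - a) / norm
    P = [[u0, -u1],
         [u1, u0]]
    return l1, l2, P


def q_form(A, x):
    """Return x^T A x for a 2x2 matrix A and a vector x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))


A = [[2.0, -1.0],
     [-1.0, 2.0]]           # Q_A(x) = 2 x1^2 + 2 x2^2 - 2 x1 x2
l1, l2, P = eig_sym_2x2(A)

y = [0.5, -2.0]             # arbitrary coordinates in the new variables
x = [P[0][0] * y[0] + P[0][1] * y[1],
     P[1][0] * y[0] + P[1][1] * y[1]]   # x = P y

lhs = q_form(A, x)                       # with cross product term
rhs = l1 * y[0] ** 2 + l2 * y[1] ** 2    # without cross product term
print(l1, l2, lhs, rhs)
```

Up to floating-point rounding, `lhs` and `rhs` coincide, which is the content of the theorem for this particular matrix.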
Definite matrix
For a quadratic function $Q_A$, one asks whether it assumes only positive values, only negative values, or both. To determine which case applies to a specific quadratic function, we get rid of the cross product terms by applying the principal axes theorem to transform it into the quadratic function $Q_D$. Then we easily deduce the case for the given $Q_A$ by inspection of $Q_D$. Each case is naturally associated with a name, and the names are positive definite, negative definite and indefinite, respectively.
Suppose that $Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ is a quadratic function and that we transform it into
$$Q_D(\mathbf{y}) = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \dots + \lambda_n y_n^2$$
where the eigenvalues of the matrix $A$ tell the story of what values $Q_A$ assumes. For $\mathbf{x} \neq \mathbf{0}$ we have that:
$Q_A$ assumes only positive values if, and only if, all the eigenvalues are positive.
$Q_A$ assumes only negative values if, and only if, all the eigenvalues are negative.
$Q_A$ assumes both positive and negative values if, and only if, the eigenvalues include both positive and negative values.
Note that in the above three cases we refer to the coefficients as the eigenvalues of $A$, since that's what the principal axes theorem provides us with. Now to what we call the function $Q_A$ (and its corresponding matrix $A$, for that matter) in each of the cases:
$Q_A$ and $A$ are said to be positive definite if all the eigenvalues of $A$ are positive.
$Q_A$ and $A$ are said to be negative definite if all the eigenvalues of $A$ are negative.
$Q_A$ and $A$ are said to be indefinite if the eigenvalues of $A$ include both positive and negative values.
However, two more cases, positive semidefinite and negative semidefinite, are required to cover all possible outcomes, leading to a total of five cases. We summarize all five here, with the corresponding requirements on the eigenvalues $\lambda_i$ of $A$.
$Q_A(\mathbf{x}) > 0$ for $\mathbf{x} \neq \mathbf{0}$ is said to be positive definite and happens only when all the eigenvalues are positive, i.e. $\lambda_i > 0$.
$Q_A(\mathbf{x}) \geq 0$ for $\mathbf{x} \neq \mathbf{0}$ is said to be positive semidefinite and happens only when all the eigenvalues are nonnegative, i.e. $\lambda_i \geq 0$.
$Q_A(\mathbf{x}) < 0$ for $\mathbf{x} \neq \mathbf{0}$ is said to be negative definite and happens only when all the eigenvalues are negative, i.e. $\lambda_i < 0$.
$Q_A(\mathbf{x}) \leq 0$ for $\mathbf{x} \neq \mathbf{0}$ is said to be negative semidefinite and happens only when all the eigenvalues are nonpositive, i.e. $\lambda_i \leq 0$.
$Q_A$ that assumes both positive and negative values is said to be indefinite and happens only when the eigenvalues include both positive and negative values.
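The five cases can be sketched as a small classifier that takes the eigenvalues as input. The function name `classify`, the tolerance `tol` and the extra label for the all-zero case are assumptions of this sketch. It reports the most specific label, so a positive definite form is not additionally reported as positive semidefinite even though it satisfies that condition too.

```python
def classify(eigenvalues, tol=1e-12):
    """Name the definiteness of a quadratic form from the eigenvalues of A.

    Eigenvalues with absolute value at most tol are treated as zero.
    """
    has_pos = any(l > tol for l in eigenvalues)
    has_neg = any(l < -tol for l in eigenvalues)
    has_zero = any(abs(l) <= tol for l in eigenvalues)

    if has_pos and has_neg:
        return "indefinite"
    if has_pos:
        return "positive semidefinite" if has_zero else "positive definite"
    if has_neg:
        return "negative semidefinite" if has_zero else "negative definite"
    # All eigenvalues are zero: the form is identically zero, which is both
    # positive and negative semidefinite.
    return "zero form"


for eigs in ([3, 1], [2, 0], [-1, -4], [0, -4], [2, -1]):
    print(eigs, classify(eigs))
```

The tolerance matters in practice because numerically computed eigenvalues of a semidefinite matrix are rarely exactly zero.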
Surfaces and curves
The second question that comes to mind when speaking about quadratic functions is graphical and therefore connected to the case of $\mathbb{R}^2$ or $\mathbb{R}^3$. The question is
What kind of curve, or surface, are we dealing with?
To answer the question we need to introduce the notion of a conic section, sometimes referred to simply as a conic. A conic section is the result of cutting a double-napped cone with a plane. The most important conic sections are illustrated below: circle (top left), ellipse (top right), parabola (bottom left) and hyperbola (bottom right).
We are not discussing the analytical geometry here, but giving a taste of how the quadratic form is related to the conic sections.
So let's say we have a quadratic equation in $\mathbb{R}^2$ to solve, as in
$$\mathbf{x}^T B \mathbf{x} = k$$
where $k$ is a constant. If the equation is the equation of a conic, and if $k \neq 0$, we can divide both sides by $k$ to obtain
$$\mathbf{x}^T A \mathbf{x} = 1$$
where
$$A = \frac{1}{k} B$$
By rotating the coordinate axes to eliminate the possible cross product terms (according to the principal axes theorem), we reduce the original equation to
$$\lambda_1 y_1^2 + \lambda_2 y_2^2 = 1$$
where $\lambda_1$ and $\lambda_2$ are the eigenvalues of the matrix $A$. They determine what kind of conic the equation represents, through the following three cases when $A$ is a $2 \times 2$ matrix:
$\mathbf{x}^T A \mathbf{x} = 1$ represents an ellipse if, and only if, both eigenvalues are positive, i.e. $A$ is positive definite.
$\mathbf{x}^T A \mathbf{x} = 1$ has no graph if both eigenvalues are negative, i.e. $A$ is negative definite.
$\mathbf{x}^T A \mathbf{x} = 1$ represents a hyperbola if, and only if, one eigenvalue is positive and the other is negative, i.e. $A$ is indefinite.
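The three cases can be sketched in code for a symmetric $2 \times 2$ matrix, using the closed-form eigenvalues of such a matrix. The function name `conic_type` and the tolerance `tol` are assumptions of this sketch. The sketch also labels one situation that the three cases above leave out, a zero eigenvalue next to a positive one, as "degenerate".

```python
import math


def conic_type(A, tol=1e-12):
    """Classify the curve x^T A x = 1 for a symmetric 2x2 matrix A."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    l1, l2 = mean + radius, mean - radius  # eigenvalues, l1 >= l2

    if l2 > tol:
        return "ellipse"       # both eigenvalues positive
    if l1 > tol and l2 < -tol:
        return "hyperbola"     # eigenvalues of opposite sign
    if l1 <= tol:
        return "no graph"      # no positive eigenvalue, so x^T A x <= 0 < 1
    return "degenerate"        # l1 > 0 and l2 = 0: a pair of parallel lines


print(conic_type([[2, -1], [-1, 2]]))   # eigenvalues 3 and 1
print(conic_type([[1, 2], [2, 1]]))     # eigenvalues 3 and -1
print(conic_type([[-1, 0], [0, -2]]))   # eigenvalues -1 and -2
print(conic_type([[1, 1], [1, 1]]))     # eigenvalues 2 and 0
```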
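In the case of the ellipse, the equation above can be rewritten as
$$\frac{y_1^2}{\left(1/\sqrt{\lambda_1}\right)^2} + \frac{y_2^2}{\left(1/\sqrt{\lambda_2}\right)^2} = 1$$
which may be recognized by the reader as an ellipse with axes lengths of
$$\frac{2}{\sqrt{\lambda_1}} \quad \text{and} \quad \frac{2}{\sqrt{\lambda_2}}$$
respectively. The special case of a circle occurs when $\lambda_1 = \lambda_2$.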
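A short sketch of the axes computation, assuming the two eigenvalues are already known. The function name `ellipse_axes` and the sample eigenvalues are illustrative only.

```python
import math


def ellipse_axes(l1, l2):
    """Full axis lengths of the ellipse l1*y1^2 + l2*y2^2 = 1."""
    if l1 <= 0 or l2 <= 0:
        raise ValueError("both eigenvalues must be positive for an ellipse")
    return 2 / math.sqrt(l1), 2 / math.sqrt(l2)


axes = ellipse_axes(1.0, 4.0)     # different eigenvalues: a proper ellipse
circle = ellipse_axes(4.0, 4.0)   # equal eigenvalues: both axes coincide
print(axes, circle)
```

Note that the larger eigenvalue gives the shorter axis, since the length is inversely proportional to the square root of the eigenvalue.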