How are eigenvectors used in practice?
Face recognition systems highlight the differences between people's faces so that only the right person is granted access. Such programs rely heavily on the concept of eigenvectors.
It turns out that human faces can be approximated as linear combinations of certain distinguishing features, such as hair color, nose size, distance between eyes, and so on.
To accurately construct these features, the system needs a lot of pictures of people to learn from, but after it has outlined the important aspects of faces, a much smaller set of special images can be used to reconstruct and compare any of the people.
These special images are called eigenfaces. The name comes from the fact that they are essentially eigenvectors of a matrix containing information about the facial features present among the set of given pictures.
The corresponding eigenvalues give a measure of how important each eigenface is for distinguishing between different people.
What does an eigenvector mean?
Eigenvectors are associated with a certain square matrix and do not change direction when multiplied by that matrix. However, a vector's length may change, and it is scaled by its corresponding eigenvalue $\lambda$.
An eigenvector is a vector that doesn't change direction when multiplied by a certain matrix, and its eigenvalue is the scalar that decides its change in length.
Any nonzero vector along the line is considered an eigenvector of the square matrix $A$. In fact, this line is a subspace of $\mathbb{R}^n$ and could have a higher dimension, and it's called the eigenspace of the corresponding eigenvalue $\lambda$ of the matrix $A$.
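The description above can be checked numerically. The following is a minimal sketch, assuming a hypothetical $2 \times 2$ matrix `A` and hypothetical helper names `mat_vec` and `eigenvalue_if_eigenvector` (none of which come from the text): it multiplies a vector by the matrix and tests whether the result is a scalar multiple of the original vector.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def eigenvalue_if_eigenvector(matrix, vector, tol=1e-9):
    """Return the scale factor if matrix * vector is a multiple of vector, else None."""
    if all(abs(x) <= tol for x in vector):
        return None  # the zero vector is never counted as an eigenvector
    image = mat_vec(matrix, vector)
    # Estimate the scale factor from the largest component of the vector.
    k = max(range(len(vector)), key=lambda i: abs(vector[i]))
    lam = image[k] / vector[k]
    # The vector is an eigenvector only if every component scales by the same factor.
    if all(abs(y - lam * x) <= tol for x, y in zip(vector, image)):
        return lam
    return None


# Hypothetical example matrix, chosen only for illustration.
A = [[2, 1],
     [1, 2]]

on_line = eigenvalue_if_eigenvector(A, [1, 1])       # scaled by 3
same_line = eigenvalue_if_eigenvector(A, [5, 5])     # same line, same factor
other_line = eigenvalue_if_eigenvector(A, [1, -1])   # scaled by 1
not_eigen = eigenvalue_if_eigenvector(A, [1, 0])     # direction changes
```

Every nonzero vector on the line through `[1, 1]` gives the same scale factor, which is why the whole line (not a single vector) belongs to the eigenvalue.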
What is the mathematical definition of an eigenvector?
Multiplying a vector $\vec{x}$ by a matrix $A$ produces another vector. For certain matching matrices and vectors, the two vectors are the same:
$$A\vec{x} = \vec{x}$$
This is one example of an eigenvector. The definition of an eigenvector is that the vector cannot change direction when multiplied by $A$. However, its length can change, so we modify the notion above by introducing the scalar $\lambda$:
$$A\vec{x} = \lambda\vec{x}$$
This equation states the eigenvalue problem, where we aim to find the sets of vectors $\vec{x}$ and scalars $\lambda$ satisfying it. To solve this problem, we rewrite the equation as:
$$(A - \lambda I)\vec{x} = \vec{0}$$
For non-trivial solutions, the determinant of $A - \lambda I$ must be zero, so that we can find the eigenvalues by solving the following equation for $\lambda$:
$$\det(A - \lambda I) = 0$$
With the eigenvalues known, the corresponding eigenvectors are found by solving the homogeneous system $(A - \lambda I)\vec{x} = \vec{0}$ for each eigenvalue.
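For a $2 \times 2$ matrix the procedure can be written out in a few lines, since $\det(A - \lambda I) = \lambda^2 - (a + d)\lambda + (ad - bc)$ is a quadratic. The sketch below assumes a hypothetical matrix `A` with real eigenvalues; the function names `eigenvalues_2x2` and `eigenvector_2x2` are illustrative and not taken from the text.

```python
import math


def eigenvalues_2x2(matrix):
    """Solve det(A - lam*I) = 0 for a 2x2 matrix with real eigenvalues."""
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)


def eigenvector_2x2(matrix, lam, tol=1e-9):
    """Return one nonzero solution of (A - lam*I) x = 0."""
    (a, b), (c, d) = matrix
    # For a nonzero row (p, q) of A - lam*I, the vector (q, -p) satisfies
    # p*q + q*(-p) = 0. Because det(A - lam*I) = 0, the other row is a
    # multiple of this one, so the same vector solves the whole system.
    for p, q in ((a - lam, b), (c, d - lam)):
        if abs(p) > tol or abs(q) > tol:
            return (q, -p)
    return (1.0, 0.0)  # A - lam*I is the zero matrix: any nonzero vector works


# Hypothetical example matrix, chosen only for illustration.
A = [[2, 1],
     [1, 2]]

lam_small, lam_large = eigenvalues_2x2(A)
v_small = eigenvector_2x2(A, lam_small)
v_large = eigenvector_2x2(A, lam_large)
```

The quadratic formula only covers the $2 \times 2$ case; for larger matrices the characteristic equation has higher degree and is usually solved numerically.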
What is an eigenvector?
Eigenvectors and eigenvalues are usually something of a threshold for the beginner to get over.
This is a shame because it is not the most complicated section in linear algebra, but rather it is in this section that the shortcomings of the beginner's newfound knowledge are discovered.
If the beginner has gained a good basic understanding from previous sections, it will make for a pleasant experience.
Let's start with the definition of a fixed point.
Let $A$ be an $n \times n$ matrix and $\vec{x}$ be an $n$-dimensional vector. A fixed point of $A$ is any value of $\vec{x}$ that satisfies the condition:
$$A\vec{x} = \vec{x}$$
Note that the fixed point is related to the matrix, according to the definition: "A fixed point of $A$ ...".
Each matrix has at least one fixed point, namely $\vec{x} = \vec{0}$, which is called the trivial fixed point. The way forward to find all the fixed points of $A$ is:
$$A\vec{x} = \vec{x}$$
$$A\vec{x} - I\vec{x} = \vec{0}$$
$$(A - I)\vec{x} = \vec{0}$$
whereupon the last line is a homogeneous linear system of equations, something that should be easy for the beginner to solve at this point.
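Remember the three outcomes for a linear system: a unique solution, infinitely many solutions and no solution. A homogeneous system always has the trivial solution, so only the first two can occur here. If we put this together with the insights around the determinant, this leads to the following theorem: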
Let $A$ be an $n \times n$ matrix. Then the following three statements are equivalent:
$A$ has non-trivial fixed points.
$\det(A - I) = 0$.
The matrix $A - I$ is not invertible.
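A non-trivial fixed point exists exactly when the homogeneous system $(A - I)\vec{x} = \vec{0}$ has more than the trivial solution, which happens exactly when $\det(A - I) = 0$. The sketch below tests this for two hypothetical integer matrices (`swap` and `stretch` are illustrative names, not from the text), using a simple cofactor expansion for the determinant.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def has_nontrivial_fixed_points(a):
    """True when det(A - I) == 0; assumes exact (integer) entries."""
    n = len(a)
    a_minus_i = [[a[i][j] - (1 if i == j else 0) for j in range(n)]
                 for i in range(n)]
    return det(a_minus_i) == 0


# Hypothetical examples: swapping coordinates leaves (1, 1) in place,
# while stretching both axes moves every nonzero vector.
swap = [[0, 1],
        [1, 0]]
stretch = [[2, 0],
           [0, 3]]

swap_has_fixed = has_nontrivial_fixed_points(swap)
stretch_has_fixed = has_nontrivial_fixed_points(stretch)
```

The exact comparison with zero is only safe because the entries are integers; with floating-point entries a tolerance would be needed.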
Definition of eigenvector
Now we are ready for the definition of an eigenvector, which is a relaxation of the definition of a fixed point (relaxation here means that the definition becomes less strict and more open-minded).
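Let $A$ be an $n \times n$ matrix, $\vec{x}$ be a nonzero $n$-dimensional vector and $\lambda$ be a scalar. For each $\vec{x}$ and $\lambda$ that meet the requirement:
$$A\vec{x} = \lambda\vec{x}$$
$\vec{x}$ is said to be an eigenvector of $A$, and the corresponding $\lambda$ is said to be its eigenvalue.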
In short, you can translate the above to:
An eigenvector is a vector which, after multiplication with the matrix $A$, stays on the same line through the origin. The eigenvalue $\lambda$ is the scale factor that affects the length and orientation of the eigenvector after multiplication; a negative $\lambda$ reverses the orientation.
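As a small worked illustration of the orientation remark, consider a hypothetical matrix that swaps the two coordinates (this example is not from the text). The vector $(1, -1)$ stays on its line but is flipped, so its eigenvalue is $-1$:

```latex
A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\vec{x} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad
A\vec{x} = \begin{pmatrix} -1 \\ 1 \end{pmatrix} = -1 \cdot \vec{x}
```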
What is an eigenvalue?
An eigenvalue $\lambda$ of an $n \times n$ matrix $A$ is the scale factor associated with an eigenvector, or eigenvectors, and satisfies the following:
$$A\vec{x} = \lambda\vec{x}$$
A practical description is that an eigenvector is a vector that does not change direction when multiplied by the matrix $A$, and an eigenvalue is its scale factor that adjusts its length.
For a deeper introduction, the beginner is referred to the section on eigenvectors. The following theorem is useful to understand:
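If $A$ is an $n \times n$ matrix and $\lambda$ is a scalar, then the following statements are equivalent:
$\lambda$ is a solution to the equation $\det(A - \lambda I) = 0$.
$\lambda$ is an eigenvalue of $A$.
The linear system $(A - \lambda I)\vec{x} = \vec{0}$ has non-trivial solutions.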
What is a characteristic equation?
To find the eigenvectors and eigenvalues of an $n \times n$ matrix $A$, the following is done:
$$A\vec{x} = \lambda\vec{x}$$
$$A\vec{x} - \lambda I\vec{x} = \vec{0}$$
$$(A - \lambda I)\vec{x} = \vec{0}$$
The last row has an unknown embedded in the matrix $A - \lambda I$, namely $\lambda$, which is why our usual solution method of Gauss-Jordan becomes tricky.
We therefore turn to our insights about the determinant - we are looking for non-trivial solutions to the above system, which requires that the determinant of $A - \lambda I$ is 0.
If it were non-zero, the system would have a unique solution, which would then be just the trivial solution $\vec{x} = \vec{0}$. Therefore, we require that there are non-trivial solutions, and thus the determinant must be 0:
$$\det(A - \lambda I) = 0$$
The above equation is called the characteristic equation, and we can solve it for values of $\lambda$. We show an example for a $3 \times 3$ matrix. Let:
We then have:
where the last line is called the matrix's characteristic polynomial. We see that ours has a double root and a single root, and we thus have two distinct eigenvalues, $\lambda_1$ and $\lambda_2$.
To find the corresponding eigenvectors of each eigenvalue, we insert the eigenvalues one at a time into the equation:
$$(A - \lambda I)\vec{x} = \vec{0}$$
We start with the first eigenvalue.
We immediately get a zero row but do an elementary row operation before we set parameters.
Now we have two zero rows. Remember that at least one zero row is expected, because our basic assumption is that the determinant is zero. We set two parameters.
and thus we have a solution space that is a plane, namely:
Since the solution space, also called the eigenspace, is a plane, we are looking for two linearly independent eigenvectors.
We select the first eigenvector $\vec{v}_1$ with one choice of parameter values and the second eigenvector $\vec{v}_2$ with another. Any choice works as long as $\vec{v}_1$ and $\vec{v}_2$ are not parallel. We get:
We continue with the second eigenvalue and get:
We immediately get a zero row! We introduce a parameter and continue to solve with the Gauss-Jordan method:
The solution space is thus a line:
We choose and have the third eigenvector.
This concludes the solution, but we summarize all eigenvalues and eigenvectors here:
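The same procedure can be sketched in code on a separate, hypothetical $3 \times 3$ matrix (not the matrix of the example above). The sketch assumes the eigenvalues are small integers, so that the roots of the characteristic equation can be found by testing candidates, and it uses exact fractions for the Gauss-Jordan step. The names `det3`, `shifted` and `null_space` are illustrative.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def shifted(m, lam):
    """Return the matrix A - lam*I."""
    n = len(m)
    return [[m[r][c] - (lam if r == c else 0) for c in range(n)]
            for r in range(n)]


def null_space(m):
    """Basis of the solution space of m x = 0 via Gauss-Jordan elimination."""
    rows = [[Fraction(v) for v in row] for row in m]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # no pivot in this column: it becomes a free parameter
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot = rows[r][c]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [v - factor * p for v, p in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        # Set one parameter to 1, the others to 0, and read off the pivot variables.
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][f]
        basis.append(vec)
    return basis


# Hypothetical matrix with a double and a single eigenvalue.
A = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]

# Assumption: eigenvalues are integers in [-10, 10], so candidates can be tested directly.
eigenvalues = [lam for lam in range(-10, 11) if det3(shifted(A, lam)) == 0]
eigenspaces = {lam: null_space(shifted(A, lam)) for lam in eigenvalues}
```

For this matrix the double eigenvalue yields two basis vectors (a plane) and the single eigenvalue yields one (a line), mirroring the structure of the worked example. Testing integer candidates is a shortcut that only works for hand-picked matrices; general matrices need a numerical root finder or a library routine.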
What is an eigenspace?
Let $A$ be defined as:
with corresponding eigenvalues and eigenvectors:
If $\lambda$ is an eigenvalue of $A$, the following equation has a solution space containing non-trivial solutions:
$$(A - \lambda I)\vec{x} = \vec{0}$$
We call these solution spaces the eigenspaces of $A$ for the corresponding eigenvalues. We found these eigenspaces in the step before we selected eigenvectors, so here we show how to define the eigenspaces when the eigenvectors are given.
Because we have two distinct eigenvalues, we have two eigenspaces (number of distinct eigenvalues = number of eigenspaces). We denote these as we have done before for ordinary solution spaces:
This leads us to introduce two additional definitions, algebraic multiplicity and geometric multiplicity.
Let matrix $A$ have eigenvalues $\lambda_i$, which have been calculated from its characteristic polynomial $p(\lambda)$. Then:
The algebraic multiplicity of $\lambda_i$ is its multiplicity as a root of $p(\lambda)$. For example, if $\lambda_1$ is a double root of $p(\lambda)$, it has an algebraic multiplicity of 2.
The geometric multiplicity of $\lambda_i$ corresponds to the dimension of its eigenspace, that is, the solution space of $(A - \lambda_i I)\vec{x} = \vec{0}$. For example, if the solution space for $\lambda_1$ is a plane, then the eigenspace has two dimensions, and thus its geometric multiplicity is 2.
For each $\lambda_i$, the geometric multiplicity is less than or equal to its algebraic multiplicity.
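The inequality can be strict. The sketch below compares two hypothetical $2 \times 2$ matrices (`shear` and `identity`, names chosen for illustration) that both have the characteristic polynomial $(1 - \lambda)^2$, so the eigenvalue 1 has algebraic multiplicity 2 in both cases. The geometric multiplicity is computed as $n$ minus the rank of $A - \lambda I$, which equals the dimension of the solution space.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix via forward elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank_count = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank_count, len(rows))
                      if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for i in range(rank_count + 1, len(rows)):
            factor = rows[i][col] / rows[rank_count][col]
            rows[i] = [v - factor * p for v, p in zip(rows[i], rows[rank_count])]
        rank_count += 1
    return rank_count


def geometric_multiplicity(matrix, lam):
    """Dimension of the eigenspace: n - rank(A - lam*I)."""
    n = len(matrix)
    a_minus_lam = [[matrix[r][c] - (lam if r == c else 0) for c in range(n)]
                   for r in range(n)]
    return n - rank(a_minus_lam)


# Both hypothetical matrices have eigenvalue 1 with algebraic multiplicity 2.
shear = [[1, 1],
         [0, 1]]
identity = [[1, 0],
            [0, 1]]

shear_geometric = geometric_multiplicity(shear, 1)        # eigenspace is a line
identity_geometric = geometric_multiplicity(identity, 1)  # eigenspace is the whole plane
```

For the shear matrix the eigenspace is only a line, so its geometric multiplicity is smaller than its algebraic multiplicity; for the identity matrix the two coincide.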
For visual learners
Animations and explanations by 3blue1brown are appreciated by many, especially by those who learn best with the help of video.