
Section 2.4 Generalized eigenvectors

Up to this point, eigenvalues and eigenvectors have helped us find some standard forms of operators. In particular, we have seen that if an operator \(T\) on an \(n\)-dimensional vector space \(V\) has \(n\) linearly independent eigenvectors, then \(T\) is diagonalizable. We also know that this condition holds if \(T\) has \(n\) distinct eigenvalues or if the operator is self-adjoint.
However, there are examples where this does not apply. For instance, the matrix \(A=\begin{bmatrix} 2 \amp 1 \\ 0 \amp 2 \\ \end{bmatrix}\) has a single eigenvalue \(\lambda=2\) and the associated eigenspace \(E_2\) is one-dimensional. In this case, the characteristic polynomial is \((\lambda-2)^2\) so the eigenvalue \(\lambda=2\) is a root with multiplicity two.
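As a quick numerical check of this example, we can compute the eigenvalues of \(A\) and the dimension of \(E_2\text{.}\) The sketch below uses NumPy, though any linear algebra package would do.

```python
import numpy as np

# The matrix from the example above.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# Both eigenvalues of A equal 2.
evals = np.linalg.eigvals(A)

# By rank-nullity, dim E_2 = 2 - rank(A - 2I), which turns out to be 1:
# the eigenspace associated to lambda = 2 is only one-dimensional.
dim_E2 = 2 - np.linalg.matrix_rank(A - 2 * np.eye(2))
print(evals, dim_E2)
```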
This example shows that looking only at eigenvalues and eigenvectors will not always be enough to find a standard form. So instead, we do something that mathematicians love to do: generalize an idea that has already proven useful. In this case, the eigenvalue/eigenvector condition is given by the equation
\begin{equation*} (T-\lambda I)\vvec = \zerovec\text{.} \end{equation*}
We will generalize this equation to
\begin{equation*} (T-\lambda I)^k\vvec = \zerovec \end{equation*}
for some \(k\) and call the solutions generalized eigenvectors.
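For the matrix \(A\) in the example above, this generalized condition succeeds where the original one fell short: \((A-2I)^2=0\text{,}\) so every nonzero vector is a generalized eigenvector for \(\lambda=2\text{.}\) A short NumPy sketch confirms this.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
N = A - 2 * np.eye(2)

# N is nonzero, so not every vector is an eigenvector of A.
# However, N @ N is the zero matrix, so (A - 2I)^2 v = 0 for every v:
# every nonzero vector is a generalized eigenvector for lambda = 2.
N_squared = N @ N
print(N_squared)
```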

Subsection 2.4.1 Generalized eigenvectors

Definition 2.4.1.

If \(T\) is an operator on \(V\text{,}\) we say that a nonzero vector \(\vvec\) is a generalized eigenvector of \(T\) associated to \(\lambda\) if \((T-\lambda I)^k\vvec=\zerovec\) for some positive integer \(k\text{.}\) The set of such vectors, together with the zero vector, is called the generalized eigenspace and is denoted \(G_\lambda\text{.}\)
Notice that every eigenvector is also a generalized eigenvector since an eigenvector satisfies \((T-\lambda I)\vvec=\zerovec\text{,}\) which is the generalized eigenvector condition with \(k=1\text{.}\)
We would like to characterize a generalized eigenspace as a subspace of \(V\text{.}\) Before doing so, however, we recall an earlier homework exercise: if \(s\) is a polynomial for which \(s(T)\vvec=\zerovec\text{,}\) then the minimal polynomial \(p_\vvec\) of the vector \(\vvec\) divides \(s\text{.}\)
With this in mind, we can characterize the generalized eigenspaces.

Proof.

Suppose that \(\vvec\) is a generalized eigenvector, which means that \((T-\lambda I)^l\vvec=0\) for some \(l\text{.}\) If \(s(x) = (x-\lambda)^l\text{,}\) then \(s(T)\vvec=0\text{,}\) which means that \(p_\vvec\) divides \(s\text{.}\) Therefore, \(p_\vvec=(x-\lambda)^m\) for some \(m\leq l\text{.}\)
The minimal polynomial \(p\) of \(T\) has exactly \(k\) factors of \(x-\lambda\text{.}\) Since \(p_\vvec\) divides \(p\text{,}\) the polynomial \(p_\vvec\) can have no more than \(k\) factors of \(x-\lambda\text{,}\) which means that \(m\leq k\text{.}\) Therefore,
\begin{equation*} (T-\lambda I)^k\vvec = (T-\lambda I)^{k-m}(T-\lambda I)^m\vvec = 0\text{,} \end{equation*}
which says that \(\vvec\) is in \(\nul((T-\lambda I)^k)\text{.}\)
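This characterization can be watched in action numerically. The sketch below uses a hypothetical \(3\times 3\) matrix whose minimal polynomial is \((x-2)^2\text{,}\) so \(k=2\text{:}\) the null spaces of \((T-2I)^j\) grow until \(j=k\) and then stabilize at \(G_2\text{.}\)

```python
import numpy as np

# A hypothetical example: a 2x2 block with 1 above the diagonal for
# eigenvalue 2, plus a 1x1 block, so the minimal polynomial is (x - 2)^2.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
N = A - 2 * np.eye(3)

def null_dim(M):
    """Dimension of nul(M), via the rank-nullity theorem."""
    return M.shape[1] - np.linalg.matrix_rank(M)

# dim nul((A - 2I)^j) for j = 1, 2, 3: grows, then stabilizes at j = k = 2.
dims = [null_dim(np.linalg.matrix_power(N, j)) for j in range(1, 4)]
print(dims)
```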
We know that a vector cannot be an eigenvector associated to two different eigenvalues. This is also true for generalized eigenvectors.

Proof.

Suppose that \(\vvec\) is a nonzero vector in \(G_\lambda\text{.}\) Then \(p_\vvec(x)=(x-\lambda)^m\) for some \(m\text{,}\) which says that \(\lambda\) is a root of \(p_\vvec\text{.}\)
At the same time, if \(\vvec\) is in \(G_\mu\text{,}\) then \(p_\vvec(x) = (x-\mu)^n\) for some \(n\text{.}\) Since
\begin{equation*} p_\vvec(\lambda) = (\lambda-\mu)^n = 0\text{,} \end{equation*}
we must have \(\lambda = \mu\text{.}\)
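Numerically, the trivial intersection shows up as the dimensions of the generalized eigenspaces adding up. In the sketch below (the helper null_basis is our own construction, not a library routine), a matrix with eigenvalues 1 and 3 satisfies \(\dim(G_1+G_3)=\dim G_1+\dim G_3\text{.}\)

```python
import numpy as np

def null_basis(M, tol=1e-10):
    """Columns form an orthonormal basis of nul(M), read off from the SVD."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

# Eigenvalue 1 in a 2x2 block, eigenvalue 3 in a 1x1 block.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
n = A.shape[0]

# nul((A - lambda I)^n) always contains G_lambda, since the exponent of
# x - lambda in the minimal polynomial is at most n.
G1 = null_basis(np.linalg.matrix_power(A - 1.0 * np.eye(n), n))
G3 = null_basis(np.linalg.matrix_power(A - 3.0 * np.eye(n), n))

# The concatenated bases are linearly independent, so G1 and G3 meet in {0}.
dim_sum = np.linalg.matrix_rank(np.hstack([G1, G3]))
print(G1.shape[1], G3.shape[1], dim_sum)
```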

Subsection 2.4.2 Complex Vector Spaces

Because of the Fundamental Theorem of Algebra, operators on complex vector spaces have special properties. In particular, the minimal polynomial of an operator on a complex vector space has the form
\begin{equation*} p(x)=(x-\lambda_1)^{k_1}(x-\lambda_2)^{k_2}\ldots(x-\lambda_m)^{k_m}\text{.} \end{equation*}
In this case, we claim that \(V\) is a direct sum of generalized eigenspaces.

Proof.

We will use induction on the dimension of \(V\text{.}\) To establish the base case, we assume that \(\dim V = 1\text{.}\) In this case, \(p(x) = x-\lambda\) so \(T-\lambda I = 0\) or \(T=\lambda I\text{.}\) Then \(V=G_\lambda\text{.}\)
For the inductive step, we will assume the result is true for all vector spaces of dimension less than \(\dim V\text{.}\) We will choose an eigenvalue \(\lambda\) and write the minimal polynomial \(p\) as
\begin{equation*} p(x) = q(x)(x-\lambda)^k \end{equation*}
where \(q(\lambda)\neq 0\text{.}\)
Notice that
\begin{equation*} \nul((T-\lambda I)^k) \cap \range((T-\lambda I)^k) = \{0\}\text{.} \end{equation*}
To see this, suppose that \(\vvec\) is in this intersection. Then \(\vvec = (T-\lambda I)^k\uvec\) for some vector \(\uvec\text{.}\) Moreover,
\begin{equation*} 0 = (T-\lambda I)^k\vvec = (T-\lambda I)^{2k}\uvec\text{.} \end{equation*}
This implies that \(p_\uvec\) divides \((x-\lambda)^{2k}\) so that the only factors of \(p_\uvec\) are \(x-\lambda\text{.}\) Since \(p_\uvec\) also divides the minimal polynomial \(p\text{,}\) we also know that \(p_\uvec\) divides \((x-\lambda)^k\text{.}\) Therefore,
\begin{equation*} \vvec = (T-\lambda I)^k\uvec = 0\text{.} \end{equation*}
Because of Proposition 1.2.13, we also know that
\begin{equation*} \dim\nul((T-\lambda I)^k) +\dim \range((T-\lambda I)^k) = \dim V\text{,} \end{equation*}
which says that
\begin{equation*} V = \nul((T-\lambda I)^k) \oplus \range((T-\lambda I)^k)\text{.} \end{equation*}
If we define \(U=\range((T-\lambda I)^k)\text{,}\) then we also have
\begin{equation*} V = G_\lambda \oplus U\text{.} \end{equation*}
Since we have written the minimal polynomial \(p(x)=q(x)(x-\lambda)^k\text{,}\) we can see that the minimal polynomial of \(T|_U\) is \(q\text{.}\) To see why, notice that if \(\uvec\) is in \(U\text{,}\) then \(\uvec=(T-\lambda I)^k\vvec\) for some vector \(\vvec\) in \(V\text{.}\) Then
\begin{equation*} q(T)\uvec = q(T)(T-\lambda I)^k \vvec = p(T)\vvec = 0\text{.} \end{equation*}
This shows that the minimal polynomial \(p_U\) of \(T|_U\) divides \(q\text{.}\) If \(p_U\) had a smaller degree than \(q\text{,}\) however, then \(p_U(x)(x-\lambda)^k\) would annihilate every vector in \(V\) and have a smaller degree than \(p\text{,}\) which would contradict the fact that the minimal polynomial of \(T\) has the smallest possible degree. Therefore, \(p_U=q\text{.}\)
By the inductive hypothesis, \(U\) may be written as a direct sum of its generalized eigenspaces. All that remains is to show that, if \(\mu\) is an eigenvalue distinct from \(\lambda\text{,}\) the generalized eigenspace of \(T\) associated to \(\mu\) is the same as the generalized eigenspace of \(T|_U\) associated to \(\mu\text{.}\) To this end, suppose that \(\vvec\) satisfies \((T-\mu I)^l\vvec=\zerovec\) for some \(l\text{.}\) Because \(V=G_\lambda \oplus U\text{,}\) we can write
\begin{equation*} \vvec = \nvec + \uvec \end{equation*}
where \(\nvec\in G_\lambda\) and \(\uvec \in U\text{.}\) We have
\begin{equation*} 0 = (T-\mu I)^l\vvec = (T-\mu I)^l\nvec + (T-\mu I)^l\uvec\text{.} \end{equation*}
Because \(G_\lambda\) and \(U\) are invariant under \(T\text{,}\) the vector \((T-\mu I)^l\nvec\) lies in \(G_\lambda\) and \((T-\mu I)^l\uvec\) lies in \(U\text{.}\) Since the sum \(G_\lambda\oplus U\) is direct, both terms must vanish; in particular,
\begin{align*} (T-\mu I)^l\nvec \amp = 0\text{.} \end{align*}
This means that \(\nvec\in G_\lambda\cap G_\mu = \{0\}\) by Proposition 2.4.4 so that \(\nvec = 0\text{.}\) Therefore, \(\vvec = \uvec\in U\) and
\begin{equation*} (T-\mu I)^l\uvec = (T|_U-\mu I)^l \uvec = 0\text{.} \end{equation*}
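The key decomposition in this proof, \(V=\nul((T-\lambda I)^k)\oplus\range((T-\lambda I)^k)\text{,}\) can also be checked numerically. The following sketch (assuming NumPy) uses a hypothetical matrix whose minimal polynomial is \((x-1)^2(x-3)\text{,}\) so \(k=2\) for \(\lambda=1\text{.}\)

```python
import numpy as np

# Minimal polynomial (x - 1)^2 (x - 3), so k = 2 for lambda = 1.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
M = np.linalg.matrix_power(A - np.eye(3), 2)

# Read bases of the null space and the range off the SVD of M.
u, s, vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
null_part = vt[rank:].T    # basis of nul((A - I)^2) = G_1
range_part = u[:, :rank]   # basis of range((A - I)^2)

# Their dimensions add to 3 and the combined columns have full rank,
# so R^3 is the direct sum of nul((A - I)^2) and range((A - I)^2).
total_rank = np.linalg.matrix_rank(np.hstack([null_part, range_part]))
print(null_part.shape[1], range_part.shape[1], total_rank)
```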

Subsection 2.4.3 Jordan form

We are now ready to prove the main structure theorem, again assuming that \(V\) is a complex vector space: there is a basis of \(V\) in which the matrix associated to \(T\) consists of Jordan blocks arranged along the diagonal. By a Jordan block, we mean a square matrix \(J\) whose diagonal entries all equal some \(\lambda\text{,}\) whose entries directly above the diagonal are 1, and whose other entries are all zero. That is,
\begin{equation*} J = \begin{bmatrix} \lambda \amp 1 \amp 0 \amp \ldots \amp 0 \\ 0 \amp \lambda \amp 1 \amp \ldots \amp 0 \\ \vdots \amp \vdots \amp \ddots \amp \ddots \amp \vdots \\ 0 \amp 0 \amp 0 \amp \ldots \amp \lambda \\ \end{bmatrix}\text{.} \end{equation*}
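A small helper makes it easy to experiment with Jordan blocks. In the NumPy sketch below (jordan_block is our own function, not a library routine), the powers of \(J-\lambda I\) push the line of 1's further from the diagonal until the matrix vanishes, so every standard basis vector is a generalized eigenvector.

```python
import numpy as np

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1's directly above it."""
    return lam * np.eye(n) + np.diag(np.ones(n - 1), k=1)

J = jordan_block(2.0, 4)
N = J - 2.0 * np.eye(4)

# Each power of N shifts the line of 1's one step further from the
# diagonal, so N^3 is still nonzero while N^4 = 0.
powers = [np.linalg.matrix_power(N, j) for j in range(1, 5)]
print(np.allclose(powers[2], 0), np.allclose(powers[3], 0))
```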

Proof.

We know that
\begin{equation*} V = G_{\lambda_1}\oplus G_{\lambda_2}\oplus\ldots \oplus G_{\lambda_m}\text{.} \end{equation*}
Moreover, on \(G_{\lambda_j}\text{,}\) the operator \(T-\lambda_j I\) is nilpotent since \((T-\lambda_j I)^{k_j}\) vanishes on \(G_{\lambda_j}\text{,}\) which means there is a basis for \(G_{\lambda_j}\) in which the matrix associated to \(T-\lambda_j I\) consists of nilpotent blocks. The matrix associated to \(T\) in this basis therefore consists of Jordan blocks. Because each generalized eigenspace is invariant under \(T\text{,}\) the theorem holds.
Notice that the characteristic polynomial of \(T\text{,}\) which can easily be found using this matrix, has the form
\begin{equation*} p(x)=(x-\lambda_1)^{m_1}(x-\lambda_2)^{m_2}\ldots(x-\lambda_k)^{m_k} \end{equation*}
where the multiplicity \(m_j\) of each eigenvalue \(\lambda_j\) equals the dimension \(\dim G_{\lambda_j}\text{.}\) We earlier called \(m_j\) the algebraic multiplicity of the eigenvalue \(\lambda_j\text{.}\) Because the eigenspace \(E_{\lambda_j}\subset G_{\lambda_j}\text{,}\) we have therefore shown that the algebraic multiplicity of each eigenvalue is at least as large as the dimension of the associated eigenspace:
\begin{equation*} 0\lt \dim E_{\lambda_j} \leq m_j\text{.} \end{equation*}
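For a concrete check of this inequality, the hypothetical matrix below has eigenvalue 2 with algebraic multiplicity 3 while \(\dim E_2 = 2\text{.}\) A NumPy sketch:

```python
import numpy as np

# Eigenvalue 2 with algebraic multiplicity 3: the characteristic
# polynomial is (x - 2)^3.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])

alg_mult = int(np.sum(np.isclose(np.linalg.eigvals(A), 2.0)))

# dim E_2 = 3 - rank(A - 2I) by rank-nullity.
geo_mult = 3 - np.linalg.matrix_rank(A - 2 * np.eye(3))
print(geo_mult, alg_mult)   # 0 < dim E_2 <= m_j, as claimed
```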