Proof.
Suppose that \(\vvec\) is in \(\nul(p(T))\text{,}\) which means that \(p(T)\vvec = \zerovec\text{.}\) We need to explain why \(T\vvec\) is also in \(\nul(p(T))\text{.}\) Since \(T\) commutes with any polynomial in \(T\text{,}\) we have
\begin{equation*}
p(T)T\vvec = Tp(T)\vvec = T\zerovec = \zerovec\text{.}
\end{equation*}
Similarly, if \(\vvec\) is in \(\range(p(T))\text{,}\) then there is a vector \(\uvec\) so that \(\vvec=p(T)\uvec\text{.}\) Then
\begin{equation*}
T\vvec = Tp(T)\uvec = p(T)(T\uvec)\text{,}
\end{equation*}
which shows that \(T\vvec\) is also in \(\range(p(T))\text{.}\) Therefore, \(\range(p(T))\) is an invariant subspace of \(T\text{.}\)
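As an illustration, here is a small worked example. The matrix \(A\) and the polynomial \(p\) are chosen only for this sketch and do not come from the text. Suppose that \(T\) is the operator on \(\mathbb{R}^3\) defined by the matrix \(A\) and that \(p(x)=x-2\text{,}\) so that
\begin{equation*}
A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix},
\qquad
p(A) = A - 2I = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\text{.}
\end{equation*}
Writing \(\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3\) for the standard basis vectors, we find that \(\nul(p(A)) = \laspan{\mathbf{e}_1}\) and \(\range(p(A)) = \laspan{\mathbf{e}_1,\mathbf{e}_3}\text{.}\) Since
\begin{equation*}
A\mathbf{e}_1 = 2\mathbf{e}_1, \qquad A\mathbf{e}_3 = 3\mathbf{e}_3\text{,}
\end{equation*}
multiplying by \(A\) sends each of these subspaces into itself, as the proof predicts.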
Proof.
Our proof proceeds by induction on the dimension of
\(V\text{.}\) To begin, suppose that
\(\dim(V) = 1\text{,}\) which means that
\(V=\laspan{\vvec}\) for some vector
\(\vvec\text{.}\) Then
\(T\vvec = \lambda\vvec\) for some
\(\lambda\text{,}\) which is possibly
\(0\text{.}\) Then
\((T-\lambda I)\vvec = \zerovec\text{,}\) which means that
\(T-\lambda I = 0\) since
\(\vvec\) spans
\(V\text{.}\) Therefore, if
\(p(x) = x-\lambda\text{,}\) we have
\(p(T)=0\text{.}\)
We now suppose that \(\dim(V)=n\) and that the theorem has been verified for all operators on vector spaces of dimension less than \(n\text{.}\) We choose a nonzero vector \(\vvec\) and consider the powers \(T^k\vvec\text{;}\) that is, consider the vectors
\begin{equation*}
\vvec,T\vvec,T^2\vvec,\ldots,T^n\vvec\text{.}
\end{equation*}
Since there are \(n+1\) vectors in this set and \(\dim(V)=n\text{,}\) we know this is a linearly dependent set.
Choose \(m\) to be the smallest integer such that \(T^m\vvec\) is a linear combination of \(\vvec,T\vvec,\ldots,T^{m-1}\vvec\text{.}\) This means two things. First, the vectors \(\vvec,T\vvec,\ldots,T^{m-1}\vvec\) are linearly independent. Second, there are constants \(a_0,a_1,\ldots,a_{m-1}\) such that
\begin{equation*}
a_0\vvec + a_1T\vvec + \ldots + a_{m-1}T^{m-1}\vvec +
T^m\vvec = \zerovec\text{.}
\end{equation*}
If we define the degree \(m\) monic polynomial
\begin{equation*}
p(x)=a_0 + a_1x + \ldots + a_{m-1}x^{m-1}+x^m\text{,}
\end{equation*}
then \(p(T)\vvec = \zerovec\text{.}\) That is, \(\vvec\) is in \(\nul(p(T))\text{.}\)
Since \(\nul(p(T))\) is invariant under \(T\) and \(\vvec\) is in \(\nul(p(T))\text{,}\) we know that
\begin{equation*}
\vvec,T\vvec,\ldots,T^{m-1}\vvec
\end{equation*}
are all in \(\nul(p(T))\text{.}\) These vectors are linearly independent so we know that \(\dim(\nul(p(T)))\geq m\text{.}\) Therefore,
\begin{equation*}
\dim(\range(p(T))) = \dim(V) - \dim(\nul(p(T))) \leq n - m\text{.}
\end{equation*}
For convenience, we will denote the vector space
\(W=\range(p(T))\text{.}\) Since
\(W\) is invariant under
\(T\text{,}\) the restriction \(T|_W\) is an operator on
\(W\text{,}\) whose dimension is at most \(n-m\) and hence less than
\(\dim(V)\) because \(m\geq 1\text{.}\) By the induction hypothesis, we know that \(T|_W\) has a minimal polynomial, that is, a unique monic polynomial
\(q(x)\) of smallest degree such that
\(q(T|_W)=0\text{.}\) Again by the induction hypothesis, it also follows that
\(\deg(q) \leq \dim(W) \leq n - m\text{.}\)
Now consider the polynomial \(qp\text{,}\) whose degree is
\begin{equation*}
\deg(qp) = \deg(q)+\deg(p) \leq (n - m) + m = n = \dim(V)\text{.}
\end{equation*}
Moreover, both \(p\) and \(q\) are monic, so \(qp\) is also monic. Finally, for any vector \(\vvec\) in \(V\text{,}\) we have
\begin{equation*}
(qp)(T)\vvec = q(T)p(T)\vvec = q(T)(p(T)\vvec) = \zerovec
\end{equation*}
where the last equality holds because \(p(T)\vvec\) is in \(W=\range(p(T))\) and \(q(T)\uvec=\zerovec\) for any vector \(\uvec\) in \(W\text{.}\) Since \((qp)(T)\vvec=\zerovec\) for every vector \(\vvec\text{,}\) this means that \((qp)(T)=0\text{.}\)
This shows that there is a monic polynomial
\(s\text{,}\) of degree at most \(\dim(V)\text{,}\) such that
\(s(T)=0\) on
\(V\text{.}\) Therefore, among all monic polynomials that vanish when evaluated on \(T\text{,}\) there is one, possibly different from \(s\text{,}\) having the smallest degree, and this is the minimal polynomial of the operator
\(T\) on
\(V\text{.}\)
To see that this polynomial is unique, suppose that \(s_1\) and \(s_2\) are two monic polynomials of this smallest degree with \(s_1(T)=0\) and \(s_2(T)=0\text{.}\) If we consider \(s_1-s_2\text{,}\) we see that \(\deg(s_1-s_2)\lt \deg(s_1)=\deg(s_2)\) since the highest degree terms of \(s_1\) and \(s_2\) have coefficients \(1\) and therefore cancel. Also,
\begin{equation*}
(s_1-s_2)(T) = s_1(T) - s_2(T) = 0\text{.}
\end{equation*}
If \(s_1-s_2\) were not the zero polynomial, dividing it by its leading coefficient would produce a monic polynomial of smaller degree that vanishes when evaluated on \(T\text{.}\) This is impossible since \(s_1\) and \(s_2\) had the smallest possible degree among all such polynomials. This means that \(s_1=s_2\text{,}\) which guarantees the uniqueness of the minimal polynomial.
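The construction in this proof can be traced through a concrete case. The following is only a sketch: the matrix \(A\) and the starting vector \(\vvec\) are hypothetical choices made for illustration and do not come from the text. Let \(T\) be the operator on \(\mathbb{R}^3\) defined by
\begin{equation*}
A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix},
\qquad
\vvec = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\text{,}
\end{equation*}
so that \(n = 3\text{.}\) The first powers are
\begin{equation*}
A\vvec = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix},
\qquad
A^2\vvec = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = 2A\vvec - \vvec\text{.}
\end{equation*}
Since \(\vvec\) and \(A\vvec\) are linearly independent, we have \(m=2\) and \(p(x) = x^2 - 2x + 1 = (x-1)^2\text{.}\) Then
\begin{equation*}
p(A) = (A-I)^2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\text{,}
\end{equation*}
so \(\dim(\nul(p(A))) = 2 \geq m\) and \(W=\range(p(A))\) is the line spanned by the third standard basis vector, which has dimension \(1 \leq n-m\text{.}\) On \(W\text{,}\) the operator acts as multiplication by \(2\text{,}\) which gives \(q(x) = x-2\text{.}\) The product \(qp\) is monic of degree \(3 = \dim(V)\text{,}\) and
\begin{equation*}
(qp)(A) = (A-2I)(A-I)^2 =
\begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= 0\text{.}
\end{equation*}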
Proof.
Suppose that
\(p\) is the minimal polynomial of
\(T\text{.}\) We need to explain two things: that any eigenvalue of
\(T\) is a root of
\(p\) and that any root of
\(p\) is an eigenvalue of
\(T\text{.}\)
Suppose that \(\lambda\) is an eigenvalue of \(T\text{.}\) This means that there is a nonzero vector \(\vvec\) such that \(T\vvec = \lambda\vvec\) and therefore \(T^j\vvec =
\lambda^j\vvec\) for every \(j\text{.}\) This means that
\begin{equation*}
\zerovec = p(T)\vvec = p(\lambda)\vvec\text{,}
\end{equation*}
which implies that \(p(\lambda) = 0\) because \(\vvec\) is nonzero. Therefore, \(\lambda\) is a root of \(p\text{,}\) the minimal polynomial of \(T\text{.}\)
Conversely, suppose that
\(\lambda\) is a root of
\(p\text{.}\) By
Proposition~1.4.4, this means that
\begin{equation*}
p(x) = (x-\lambda)q(x)\text{.}
\end{equation*}
This says that
\begin{equation*}
0 = p(T) = (T-\lambda I)q(T)\text{.}
\end{equation*}
However, \(q(T)\neq 0\) since \(q\) is a nonzero polynomial with \(\deg(q) \lt \deg(p)\) and \(p\) is the minimal polynomial of \(T\text{.}\) Hence there is some vector \(\vvec\) for which \(q(T)\vvec\neq \zerovec\text{.}\) Therefore,
\begin{equation*}
\zerovec = p(T)\vvec = (T-\lambda I)q(T)\vvec\text{,}
\end{equation*}
which shows that the nonzero vector \(q(T)\vvec\) is an eigenvector of \(T\) with associated eigenvalue \(\lambda\text{.}\)
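Both directions of this argument can be checked on a small example. The matrix below is an assumption made only for this sketch and does not come from the text. Let \(T\) be the operator on \(\mathbb{R}^2\) defined by
\begin{equation*}
A = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}\text{.}
\end{equation*}
Because \(A\) is not a multiple of the identity, no monic polynomial of degree \(1\) vanishes on \(A\text{,}\) while
\begin{equation*}
(A-2I)(A-3I) =
\begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix}
= 0\text{,}
\end{equation*}
so the minimal polynomial is \(p(x) = (x-2)(x-3)\text{.}\) The eigenvalues \(2\) and \(3\) of \(A\) are exactly the roots of \(p\text{.}\) To see the converse direction at work, take the root \(\lambda = 3\text{,}\) so that \(q(x) = x-2\text{,}\) and choose \(\vvec = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\text{.}\) Then
\begin{equation*}
q(A)\vvec = (A-2I)\vvec = \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\qquad
A\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}
= 3\begin{bmatrix} 1 \\ 1 \end{bmatrix}\text{,}
\end{equation*}
which confirms that \(q(A)\vvec\) is an eigenvector with associated eigenvalue \(3\text{.}\)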