Subsection 2.1.1 Upper triangular matrices
As we have seen in the past, upper triangular matrices have some simple properties. For one, the eigenvalues of the associated operator equal the diagonal elements of the matrix. We have also seen that linear systems formed by triangular matrices are relatively easy to solve. All told, they form a pleasant set of matrices.
Remember that an upper triangular matrix is one whose entries below the diagonal are all zero; that is, it has the form
\begin{equation*}
\begin{bmatrix}
* \amp * \amp \ldots \amp * \amp * \\
0 \amp * \amp \ldots \amp * \amp * \\
\vdots \amp \vdots \amp \ddots \amp \vdots \amp \vdots \\
0 \amp 0 \amp \ldots \amp 0 \amp * \\
\end{bmatrix}\text{.}
\end{equation*}
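For instance,
\begin{equation*}
\begin{bmatrix}
3 \amp 0 \amp -2 \\
0 \amp 1 \amp 4 \\
0 \amp 0 \amp 1 \\
\end{bmatrix}
\end{equation*}
is upper triangular: every entry below the diagonal is zero, while the entries on or above the diagonal are unrestricted and may themselves be zero.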
Suppose that \(T\) is an operator and that there is a basis \(\vvec_1,\vvec_2,\ldots,\vvec_n\) for which the associated matrix \(A\) for \(T\) is upper triangular. Since \(A_{k,j} = 0\) if \(k \gt j\text{,}\) we have
\begin{align}
T\vvec_1 \amp = A_{1,1}\vvec_1\tag{2.1.1}\\
T\vvec_2 \amp = A_{1,2}\vvec_1 + A_{2,2}\vvec_2\tag{2.1.2}\\
T\vvec_3 \amp = A_{1,3}\vvec_1 + A_{2,3}\vvec_2 + A_{3,3}\vvec_3\tag{2.1.3}
\end{align}
and so forth.
Notice that (2.1.1) says that \(T\vvec_1 = A_{1,1}\vvec_1\text{,}\) so that \(\vvec_1\) is an eigenvector with associated eigenvalue \(A_{1,1}\text{.}\) Similarly, (2.1.2) says that
\begin{equation*}
T\vvec_2 = A_{1,2}\vvec_1 + A_{2,2}\vvec_2\text{,}
\end{equation*}
which implies that \(\laspan{\vvec_1,\vvec_2}\) is invariant under \(T\text{.}\)
More generally,
\begin{equation*}
T\vvec_j = A_{1,j}\vvec_1 + A_{2,j}\vvec_2 + \ldots +
A_{j,j}\vvec_j\text{,}
\end{equation*}
which says that, for every \(j\text{,}\) \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) is invariant under \(T\text{.}\)
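To illustrate, suppose that the matrix of \(T\) in the basis \(\vvec_1,\vvec_2,\vvec_3\) is
\begin{equation*}
A = \begin{bmatrix}
2 \amp 1 \amp 3 \\
0 \amp 5 \amp -1 \\
0 \amp 0 \amp 5 \\
\end{bmatrix}\text{.}
\end{equation*}
Then \(T\vvec_1 = 2\vvec_1\text{,}\) \(T\vvec_2 = \vvec_1 + 5\vvec_2\text{,}\) and \(T\vvec_3 = 3\vvec_1 - \vvec_2 + 5\vvec_3\text{,}\) so each of \(\laspan{\vvec_1}\text{,}\) \(\laspan{\vvec_1,\vvec_2}\text{,}\) and \(\laspan{\vvec_1,\vvec_2,\vvec_3}\) is invariant under \(T\text{.}\)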
Let’s record this as a proposition.
Proposition 2.1.1.
Suppose that \(T\) is an operator on the vector space \(V\) and that \(\vvec_1,\vvec_2,\ldots,\vvec_n\) is a basis for \(V\text{.}\) Then the matrix of \(T\) associated to this basis is upper triangular if and only if \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) is invariant under \(T\) for each \(j\text{.}\)
Proof.
The discussion above explains why the subspaces \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) are invariant when the matrix of \(T\) is upper triangular. Conversely, suppose that \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) is invariant for every \(j\text{.}\) Then \(T\vvec_j\) lies in \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\text{,}\) so the matrix \(A\) associated to \(T\) satisfies
\begin{equation*}
T\vvec_j = \sum_{k=1}^n A_{k,j}\vvec_k =
\sum_{k=1}^jA_{k,j}\vvec_k\text{,}
\end{equation*}
which shows that \(A_{k,j}=0\) if \(k\gt j\) and that \(A\) is therefore upper triangular.
We can rephrase this in terms of polynomials.
Proposition 2.1.2.
Suppose that \(T\) is an operator on \(V\) and that there is a basis \(\vvec_1,\vvec_2,\ldots,\vvec_n\) of \(V\) such that the matrix \(A\) of \(T\) is upper triangular. Then
\begin{equation*}
(T-A_{1,1}I)
(T-A_{2,2}I)
\ldots
(T-A_{n,n}I) = 0\text{.}
\end{equation*}
Proof.
We will use \(q\) to denote the polynomial
\begin{equation*}
q(x)=(x-A_{1,1})(x-A_{2,2})\ldots(x-A_{n,n})\text{.}
\end{equation*}
First consider \(\vvec_1\text{.}\) Since \(A\) is upper triangular, we know that \(T\vvec_1=A_{1,1}\vvec_1\text{,}\) which means that \((T-A_{1,1}I)\vvec_1 = 0\text{.}\) Therefore, \(q(T)\vvec_1 = 0\text{.}\)
Next consider \(\vvec_2\text{,}\) for which \(T\vvec_2 =
A_{1,2}\vvec_1 + A_{2,2}\vvec_2\text{.}\) Rearranging gives
\begin{equation*}
(T-A_{2,2}I)\vvec_2 = A_{1,2}\vvec_1\text{.}
\end{equation*}
Therefore,
\begin{equation*}
(T-A_{1,1}I)(T-A_{2,2}I)\vvec_2 = A_{1,2}(T-A_{1,1}I)\vvec_1 = 0\text{,}
\end{equation*}
which tells us that \(q(T)\vvec_2=0\text{.}\)
Continuing in this way, we see that
\begin{equation*}
(T-A_{1,1}I)(T-A_{2,2}I)\ldots (T-A_{j,j}I)\vvec_j = 0
\end{equation*}
and hence, since the factors of \(q(T)\) commute, that \(q(T)\vvec_j=0\text{.}\) This means that \(q(T)\vvec=0\) for every vector \(\vvec\text{,}\) and so we know that \(q(T)=0\) as claimed.
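As a concrete check, consider again the operator whose matrix is
\begin{equation*}
A = \begin{bmatrix}
2 \amp 1 \amp 3 \\
0 \amp 5 \amp -1 \\
0 \amp 0 \amp 5 \\
\end{bmatrix}\text{.}
\end{equation*}
The diagonal entries are \(2\text{,}\) \(5\text{,}\) and \(5\text{,}\) and a direct computation gives
\begin{equation*}
(A-2I)(A-5I) = \begin{bmatrix}
0 \amp 0 \amp -1 \\
0 \amp 0 \amp -3 \\
0 \amp 0 \amp 0 \\
\end{bmatrix}\text{,}
\end{equation*}
and multiplying by \(A-5I\) once more gives
\begin{equation*}
(A-2I)(A-5I)(A-5I) = 0\text{,}
\end{equation*}
just as the proposition asserts.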
This leads to the following crucial result.
Theorem 2.1.3.
Suppose that \(T\) is an operator on \(V\) over the field \(\field\text{.}\) There is a basis for \(V\) for which the matrix of \(T\) is upper triangular if and only if the minimal polynomial of \(T\) has the form
\begin{equation*}
p(x) = (x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_m)
\end{equation*}
where the roots \(\lambda_j\) are in \(\field\text{.}\)
Moreover, when this happens, the diagonal entries of the matrix are the eigenvalues of \(T\text{.}\)
Proof.
One direction is a natural consequence of
Proposition 2.1.2. Suppose that there is a basis for which the matrix
\(A\) of
\(T\) is upper triangular. That proposition tells us that
\(q(T) = 0\) for
\begin{equation*}
q(x)=(x-A_{1,1})(x-A_{2,2})\ldots(x-A_{n,n})\text{.}
\end{equation*}
Since \(q(T) = 0\text{,}\) we know that \(q\) is a multiple of the minimal polynomial of \(T\text{.}\) Because \(q\) is a product of linear factors, the minimal polynomial, which divides \(q\text{,}\) is a product of some of those factors and so has the form given in the statement of this theorem. Because the eigenvalues are the roots of the minimal polynomial, we also know that the diagonal entries of the matrix are the eigenvalues.
Now suppose that the minimal polynomial \(p\) has the form given. We will prove by induction that there is a basis for which the matrix of \(T\) is upper triangular.
First suppose that \(m=1\) so that \(T-\lambda_1I =
0\text{.}\) This means that \(T=\lambda_1 I\) so that the matrix of \(T\) in any basis is diagonal and hence upper triangular. Notice that the diagonal entries of this matrix are \(\lambda_1\text{,}\) which is an eigenvalue of \(T\text{.}\)
Let’s now suppose that the result is true for any operator and vector space for which the minimal polynomial has the form
\begin{equation*}
p(x) = (x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_k)
\end{equation*}
where \(k\lt m\text{.}\) Consider the subspace \(W=\range(T-\lambda_mI)\text{,}\) which we know is invariant under \(T\text{.}\) Since any vector \(\wvec\) in \(W\) has the form \(\wvec = (T-\lambda_mI)\uvec\text{,}\) we know that
\begin{equation*}
(T-\lambda_1I)(T-\lambda_2I)\ldots(T-\lambda_{m-1}I)\wvec = p(T)\uvec = \zerovec\text{.}
\end{equation*}
Therefore, if
\begin{equation*}
q(x)=
(x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_{m-1})\text{,}
\end{equation*}
then \(q(T|_W) = 0\text{.}\) As a result, \(q\) is a multiple of the minimal polynomial of \(T|_W\) and so the minimal polynomial of \(T|_W\) is a product of fewer than \(m\) terms having the form \(x-\alpha_j\text{.}\) By the induction hypothesis, we know there is a basis \(\wvec_1,\wvec_2,\ldots,\wvec_k\) for \(W\) so that the matrix of \(T|_W\) is upper triangular.
We will now extend the basis \(\wvec_1,\wvec_2,\ldots,\wvec_k\) of \(W\) to a basis of \(V\) by adding vectors \(\vvec_1,\vvec_2,\ldots,\vvec_{n-k}\text{.}\) For each \(j\text{,}\) we have
\begin{equation*}
T\vvec_j = (T-\lambda_mI)\vvec_j + \lambda_m\vvec_j\text{,}
\end{equation*}
and \((T-\lambda_mI)\vvec_j\) belongs to \(W\text{,}\) so it is a linear combination of \(\wvec_1,\wvec_2,\ldots,\wvec_k\text{;}\) that is,
\begin{equation*}
T\vvec_j = c_1\wvec_1 + c_2\wvec_2 + \ldots + c_k\wvec_k + \lambda_m\vvec_j
\end{equation*}
for some scalars \(c_1,c_2,\ldots,c_k\text{.}\) This shows that the matrix of \(T\) in the basis \(\wvec_1,\ldots,\wvec_k,\vvec_1,\ldots,\vvec_{n-k}\) is upper triangular. Furthermore, the diagonal entry of the matrix associated to \(\vvec_j\) is \(\lambda_m\text{,}\) which shows that the diagonal entries of the matrix are the eigenvalues of \(T\text{.}\)
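For the matrix \(A\) with diagonal entries \(2\text{,}\) \(5\text{,}\) and \(5\) considered earlier, we found that \((A-2I)(A-5I)\neq 0\) while \((A-2I)(A-5I)(A-5I) = 0\text{;}\) since both eigenvalues \(2\) and \(5\) must be roots of the minimal polynomial, the minimal polynomial of that operator is \((x-2)(x-5)^2\text{.}\) Its roots, \(2\) and \(5\text{,}\) are exactly the diagonal entries of \(A\text{,}\) just as Theorem 2.1.3 predicts.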
Notice that Theorem 1.4.5 tells us that any polynomial with complex coefficients has the form given in Theorem 2.1.3. As a result, for any operator on a complex vector space \(V\text{,}\) there is a basis for which the associated matrix is upper triangular.
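The assumption on the field matters here. For example, the operator \(T\) on a two-dimensional real vector space whose matrix in some basis is
\begin{equation*}
\begin{bmatrix}
0 \amp -1 \\
1 \amp 0 \\
\end{bmatrix}
\end{equation*}
satisfies \(T^2 = -I\text{.}\) Since \(T\) is not a scalar multiple of the identity, its minimal polynomial is \(x^2+1\text{,}\) which has no real roots, and so Theorem 2.1.3 tells us there is no basis for which the matrix of this operator is upper triangular.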