
Section 2.1 Upper triangular matrices

In this section, we will use our understanding of the minimal polynomial to find some standard forms for matrices of operators. First, we will consider upper triangular matrices.
As we have seen in the past, upper triangular matrices have some simple properties. For one, the eigenvalues of the associated operator equal the diagonal elements of the matrix. We have also seen that linear systems formed by triangular matrices are relatively easy to solve. All told, they form a pleasant set of matrices.
Remember that an upper triangular matrix is one whose entries are zero below the diagonal; that is, they have a form like this:
\begin{equation*} \begin{bmatrix} * \amp * \amp \ldots \amp * \amp * \\ 0 \amp * \amp \ldots \amp * \amp * \\ \vdots \amp \vdots \amp \ddots \amp \vdots \amp \vdots \\ 0 \amp 0 \amp \ldots \amp 0 \amp * \\ \end{bmatrix}\text{.} \end{equation*}
Suppose that \(T\) is an operator and that there is a basis \(\vvec_1,\vvec_2,\ldots,\vvec_n\) for which the associated matrix \(A\) for \(T\) is upper triangular. Since \(A_{k,j} = 0\) if \(k \gt j\text{,}\) we have
\begin{align} T\vvec_1 \amp = A_{1,1}\vvec_1\tag{2.1.1}\\ T\vvec_2 \amp = A_{1,2}\vvec_1 + A_{2,2}\vvec_2\tag{2.1.2}\\ T\vvec_3 \amp = A_{1,3}\vvec_1 + A_{2,3}\vvec_2 + A_{3,3}\vvec_3\tag{2.1.3} \end{align}
and so forth.
Notice that (2.1.1) says that \(T\vvec_1 = A_{1,1}\vvec_1\) so that \(\vvec_1\) is an eigenvector with associated eigenvalue \(A_{1,1}\text{.}\)
In addition, (2.1.2) tells us that
\begin{equation*} T\vvec_2 = A_{1,2}\vvec_1 + A_{2,2}\vvec_2\text{,} \end{equation*}
which implies that \(\laspan{\vvec_1,\vvec_2}\) is invariant under \(T\text{.}\)
More generally,
\begin{equation*} T\vvec_j = A_{1,j}\vvec_1 + A_{2,j}\vvec_2 + \ldots + A_{j,j}\vvec_j\text{,} \end{equation*}
which says that, for every \(j\text{,}\) \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) is invariant under \(T\text{.}\)
Let’s record this as a proposition.

Proposition 2.1.1.
The matrix of \(T\) in the basis \(\vvec_1,\vvec_2,\ldots,\vvec_n\) is upper triangular if and only if \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) is invariant under \(T\) for every \(j\text{.}\)

Proof.

The discussion above explains why an operator with an upper triangular matrix forms invariant subspaces. Conversely, suppose that \(\laspan{\vvec_1,\vvec_2,\ldots,\vvec_j}\) is invariant for every \(j\text{.}\) The matrix \(A\) associated to \(T\) satisfies
\begin{equation*} T\vvec_j = \sum_{k=1}^n A_{k,j}\vvec_k = \sum_{k=1}^jA_{k,j}\vvec_k\text{,} \end{equation*}
which shows that \(A_{k,j}=0\) if \(k\gt j\) and that \(A\) is therefore upper triangular.
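If we want to see this equivalence concretely, a short numerical sketch (using NumPy, which is not part of the text) checks that an upper triangular matrix maps each standard basis vector \(e_j\) into the span of \(e_1,\ldots,e_j\text{:}\)

```python
import numpy as np

# An arbitrary 4x4 upper triangular matrix
A = np.triu(np.arange(1.0, 17.0).reshape(4, 4))

# A e_j is column j of A; it lies in span(e_1, ..., e_j)
# exactly when its entries below position j are all zero.
n = A.shape[0]
invariant = all(
    np.allclose((A @ np.eye(n)[:, j])[j + 1:], 0.0) for j in range(n)
)
print(invariant)  # True: every span(e_1, ..., e_j) is invariant
```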
We can rephrase this in terms of polynomials.

Proposition 2.1.2.
If the matrix \(A\) of \(T\) in some basis is upper triangular, then \(q(T) = 0\) where
\begin{equation*} q(x)=(x-A_{1,1})(x-A_{2,2})\ldots(x-A_{n,n})\text{.} \end{equation*}

Proof.

We will use \(q\) to denote the polynomial
\begin{equation*} q(x)=(x-A_{1,1})(x-A_{2,2})\ldots(x-A_{n,n})\text{.} \end{equation*}
First consider \(\vvec_1\text{.}\) Since \(A\) is upper triangular, we know that \(T\vvec_1=A_{1,1}\vvec_1\text{,}\) which means that \((T-A_{1,1}I)\vvec_1 = 0\text{.}\) Therefore, \(q(T)\vvec_1 = 0\text{.}\)
Next consider \(\vvec_2\text{,}\) for which \(T\vvec_2 = A_{1,2}\vvec_1 + A_{2,2}\vvec_2\text{.}\) Rearranging gives
\begin{equation*} (T-A_{2,2}I)\vvec_2 = A_{1,2}\vvec_1\text{.} \end{equation*}
Therefore,
\begin{equation*} (T-A_{1,1}I)(T-A_{2,2}I)\vvec_2 = A_{1,2}(T-A_{1,1}I)\vvec_1 = 0\text{,} \end{equation*}
which tells us that \(q(T)\vvec_2=0\text{.}\)
Continuing in this way, we see that
\begin{equation*} (T-A_{1,1}I)(T-A_{2,2}I)\ldots (T-A_{j,j}I)\vvec_j = 0 \end{equation*}
and hence that \(q(T)\vvec_j=0\text{.}\) This means that \(q(T)\vvec=0\) for every vector \(\vvec\) and so we know that \(q(T)=0\) as claimed.
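To illustrate the proposition numerically, here is a quick NumPy sketch (not part of the text) forming \(q(A)\) for a small upper triangular matrix from its diagonal entries:

```python
import numpy as np

# An upper triangular matrix with diagonal entries 2, 3, 5
A = np.array([[2.0, 1.0, 4.0],
              [0.0, 3.0, 7.0],
              [0.0, 0.0, 5.0]])
I = np.eye(3)

# q(A) = (A - 2I)(A - 3I)(A - 5I), with factors built
# from the diagonal entries of A
q_of_A = (A - 2 * I) @ (A - 3 * I) @ (A - 5 * I)
print(np.allclose(q_of_A, 0.0))  # True: q(A) is the zero matrix
```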
This leads to the following crucial result.

Theorem 2.1.3.
There is a basis for which the matrix of \(T\) is upper triangular if and only if the minimal polynomial of \(T\) has the form
\begin{equation*} p(x) = (x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_m)\text{.} \end{equation*}
In this case, the diagonal entries of the matrix are eigenvalues of \(T\text{.}\)

Proof.

One direction is a natural consequence of Proposition 2.1.2. Suppose that there is a basis for which the matrix \(A\) of \(T\) is upper triangular. That proposition tells us that \(q(T) = 0\) for
\begin{equation*} q(x)=(x-A_{1,1})(x-A_{2,2})\ldots(x-A_{n,n})\text{.} \end{equation*}
Since \(q(T) = 0\text{,}\) Proposition 1.4.29 tells us that \(q\) is a multiple of the minimal polynomial of \(T\text{,}\) which says that the minimal polynomial has the form given in the statement of this theorem. Because the eigenvalues are the roots of the minimal polynomial, we also know that the diagonal entries of the matrix are the eigenvalues.
Now suppose that the minimal polynomial \(p\) has the form given. We will prove by induction that there is a basis for which the matrix of \(T\) is upper triangular.
First suppose that \(m=1\) so that \(T-\lambda_1I = 0\text{.}\) This means that \(T=\lambda_1 I\) so that the matrix of \(T\) in any basis is diagonal and hence upper triangular. Notice that the diagonal entries of this matrix are \(\lambda_1\text{,}\) which is an eigenvalue of \(T\text{.}\)
Let’s now suppose that the result is true for any operator and vector space for which the minimal polynomial has the form
\begin{equation*} p(x) = (x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_k) \end{equation*}
where \(k\lt m\text{.}\) Consider the subspace \(W=\range(T-\lambda_mI)\text{,}\) which we know is invariant under \(T\text{.}\) Since any vector \(\wvec\) in \(W\) has the form \(\wvec = (T-\lambda_mI)\uvec\text{,}\) we know that
\begin{equation*} (T-\lambda_1I)(T-\lambda_2I)\ldots(T-\lambda_{m-1}I)\wvec = \zerovec\text{.} \end{equation*}
Therefore, if
\begin{equation*} q(x)= (x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_{m-1})\text{,} \end{equation*}
then \(q(T|_W) = 0\text{.}\) As a result, \(q\) is a multiple of the minimal polynomial of \(T|_W\) and so the minimal polynomial of \(T|_W\) is a product of fewer than \(m\) terms having the form \(x-\alpha_j\text{.}\) By the induction hypothesis, we know there is a basis \(\wvec_1,\wvec_2,\ldots,\wvec_k\) for \(W\) so that the matrix of \(T|_W\) is upper triangular.
We will now extend the basis of \(W\) to a basis of \(V\) by vectors \(\vvec_1,\vvec_2,\ldots,\vvec_{n-k}\text{.}\) Since \((T-\lambda_mI)\vvec_j\) is in \(W\text{,}\) we have
\begin{equation*} T\vvec_j = (T-\lambda_mI)\vvec_j + \lambda_m\vvec_j \end{equation*}
or
\begin{equation*} T\vvec_j = c_{1,j}\wvec_1 + c_{2,j}\wvec_2 + \ldots + c_{k,j}\wvec_k + \lambda_m\vvec_j \end{equation*}
for some scalars \(c_{i,j}\text{,}\)
which shows that the matrix of \(T\) is upper triangular. Furthermore, the diagonal entry of the matrix associated to \(\vvec_j\) is \(\lambda_m\text{,}\) which shows that the diagonal entries of the matrix are the eigenvalues of \(T\text{.}\)
Notice that the Fundamental Theorem of Algebra tells us that any polynomial with complex coefficients factors into linear terms as in Theorem 2.1.3. As a result, for any operator on a complex vector space \(V\text{,}\) there is a basis for which the associated matrix is upper triangular.
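The distinction between real and complex scalars matters here. As a sketch (using NumPy; this example is not part of the text), a rotation by 90 degrees has minimal polynomial \(x^2+1\text{,}\) which has no real roots, so Theorem 2.1.3 says no real basis makes its matrix upper triangular, while over \(\complex\) the polynomial factors and complex eigenvalues exist:

```python
import numpy as np

# Rotation by 90 degrees: minimal polynomial x^2 + 1 has no real
# roots, so R has no real upper triangular form.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

print(np.allclose(R @ R + np.eye(2), 0.0))  # True: R^2 + I = 0

# Over the complex numbers, x^2 + 1 = (x - i)(x + i),
# so complex eigenvalues (and a triangularizing basis) exist.
print(np.linalg.eigvals(R))  # the complex eigenvalues i and -i
```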

Example 2.1.5.

Let’s consider the operator \(T:\real^3\to\real^3\) defined by the matrix \(A=\begin{bmatrix} 2 \amp 8 \amp 8 \\ 6 \amp 3 \amp 9 \\ 10 \amp 7 \amp 1 \end{bmatrix}\text{.}\) The minimal polynomial of \(T\) is
\begin{equation*} p(x)=(x-18)(x+6)^2\text{,} \end{equation*}
which satisfies the hypothesis of Theorem 2.1.3. Therefore, there is a basis \(\bcal\) so that \(\coords{T}{\bcal}\) is upper triangular. We will find such a basis following the proof of the theorem.
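Before proceeding, we can verify this minimal polynomial numerically (a NumPy sketch, not part of the text): \(p(T)=0\) while no proper divisor of \(p\) annihilates \(T\text{.}\)

```python
import numpy as np

A = np.array([[ 2.0, 8.0, 8.0],
              [ 6.0, 3.0, 9.0],
              [10.0, 7.0, 1.0]])
I = np.eye(3)

# p(A) = (A - 18I)(A + 6I)^2 annihilates A ...
p_of_A = (A - 18 * I) @ (A + 6 * I) @ (A + 6 * I)
print(np.allclose(p_of_A, 0.0))  # True

# ... while dropping either factor does not, so p is minimal.
print(np.allclose((A - 18 * I) @ (A + 6 * I), 0.0))  # False
print(np.allclose((A + 6 * I) @ (A + 6 * I), 0.0))   # False
```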
Consider one factor \(x+6\) of the minimal polynomial of \(T\) and construct \(U_1 = \range(T+6I)\text{.}\) We see that
\begin{equation*} A + 6I = \begin{bmatrix} 8 \amp 8 \amp 8 \\ 6 \amp 9 \amp 9 \\ 10 \amp 7 \amp 7 \end{bmatrix} \sim \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 0 \\ \end{bmatrix}\text{.} \end{equation*}
This shows that a basis for \(U_1\) consists of the vectors
\begin{equation*} \uvec_1 = \threevec86{10},\hspace{12pt} \uvec_2 = \threevec89{7}\text{.} \end{equation*}
We have
\begin{align*} T\uvec_1 \amp = 2\uvec_1 + 16\uvec_2\\ T\uvec_2 \amp = 8\uvec_1 + 10\uvec_2 \end{align*}
so that the matrix \(\begin{bmatrix}2 \amp 8 \\ 16 \amp 10 \end{bmatrix}\) represents \(T|_{U_1}\) in this basis.
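These coordinates can be recovered computationally (a NumPy sketch, not part of the text): because \(U_1\) is invariant, a least-squares solve expresses \(T\uvec_1\) and \(T\uvec_2\) exactly in the basis \(\{\uvec_1,\uvec_2\}\text{.}\)

```python
import numpy as np

A = np.array([[ 2.0, 8.0, 8.0],
              [ 6.0, 3.0, 9.0],
              [10.0, 7.0, 1.0]])
u1 = np.array([8.0, 6.0, 10.0])
u2 = np.array([8.0, 9.0, 7.0])
U = np.column_stack([u1, u2])

# Coordinates of A u1 and A u2 in the basis {u1, u2};
# the residual is zero because U_1 is invariant under T.
coords, *_ = np.linalg.lstsq(U, A @ U, rcond=None)
print(np.round(coords))  # the matrix [[2, 8], [16, 10]] of T|_{U_1}
```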
Of course, we know that the minimal polynomial of \(T|_{U_1}\) is \((x-18)(x+6)\) so we consider the operator \(T|_{U_1}+6I\) on \(U_1\text{,}\) which is represented by the matrix \(\begin{bmatrix}8 \amp 8 \\ 16 \amp 16 \end{bmatrix}\) in this basis. Therefore, a basis for \(U_2 = \range(T|_{U_1}+6I)\) is
\begin{equation*} \vvec_1 = 8\uvec_1 + 16\uvec_2 = \threevec{192}{192}{192}\text{.} \end{equation*}
We note that \(T\vvec_1 = 18\vvec_1\text{,}\) as expected since the minimal polynomial of \(T|_{U_2}\) is \(x-18\text{.}\)
We will now work our way back out to find a basis for \(U_1=\laspan{\uvec_1,\uvec_2}\text{.}\) We have \(\vvec_1 = 8\uvec_1+16\uvec_2\) so we will take \(\vvec_2 = \uvec_1=\threevec86{10}\text{.}\) (There are lots of possible choices since we just need to choose a vector in \(U_1\) that is not a scalar multiple of \(\vvec_1\text{.}\)) Notice that
\begin{align*} T\vvec_1 \amp = 18\vvec_1\\ T\vvec_2 \amp = \vvec_1 - 6\vvec_2\text{.} \end{align*}
We just need to find a vector \(\vvec_3\) so that \(\{\vvec_1,\vvec_2,\vvec_3\}\) forms a basis for \(\real^3\text{.}\) Again, there are lots of possibilities so we choose \(\vvec_3 = \threevec100\text{.}\) We see that
\begin{align*} T\vvec_1 \amp = 18\vvec_1\\ T\vvec_2 \amp = \vvec_1 - 6\vvec_2\\ T\vvec_3 \amp = \vvec_2 - 6\vvec_3\text{.} \end{align*}
If we call the basis \(\bcal=\{\vvec_1,\vvec_2,\vvec_3\}\text{,}\) then
\begin{equation*} B = \coords{T}{\bcal}= \begin{bmatrix} 18 \amp 1 \amp 0 \\ 0 \amp -6 \amp 1 \\ 0 \amp 0 \amp -6 \\ \end{bmatrix}\text{,} \end{equation*}
which shows that we have found an upper triangular matrix representing \(T\text{.}\)
Thinking in terms of matrices, we construct
\begin{equation*} P=\begin{bmatrix}\vvec_1 \amp \vvec_2 \amp \vvec_3 \end{bmatrix} = \begin{bmatrix} 192 \amp 8 \amp 1 \\ 192 \amp 6 \amp 0 \\ 192 \amp 10 \amp 0 \end{bmatrix} \end{equation*}
and find that our original matrix \(A=PBP^{-1}\text{.}\) In other words, we have demonstrated that \(A\) is similar to an upper triangular matrix, as predicted by Theorem 2.1.3.
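This similarity is easy to confirm with a short NumPy computation (a sketch, not part of the text); checking \(AP = PB\) avoids computing the inverse:

```python
import numpy as np

A = np.array([[ 2.0, 8.0, 8.0],
              [ 6.0, 3.0, 9.0],
              [10.0, 7.0, 1.0]])
B = np.array([[18.0,  1.0,  0.0],
              [ 0.0, -6.0,  1.0],
              [ 0.0,  0.0, -6.0]])
P = np.array([[192.0,  8.0, 1.0],
              [192.0,  6.0, 0.0],
              [192.0, 10.0, 0.0]])

# A = P B P^{-1} is equivalent to A P = P B since P is invertible
print(np.allclose(A @ P, P @ B))  # True
```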