
Section 2.2 The Spectral Theorem

In Section 2.1, we saw conditions that enable us to represent a linear transformation as an upper triangular matrix. This is our first theorem about a standard form, and it puts us in a position to prove an important result that we used earlier, the Spectral Theorem.
The version of the Spectral Theorem that we saw concerns real symmetric matrices, which are square matrices for which \(A=A^T\text{.}\) This necessarily means that we are working in an inner product space, so we will first extend our results on upper triangular matrices to operators on inner product spaces.

Subsection 2.2.1 The Schur decomposition

We will first consider complex vector spaces. In particular, suppose that \(V\) is a finite-dimensional complex inner product space and that \(T:V\to V\) is an operator on \(V\text{.}\) By the Fundamental Theorem of Algebra, we know that the minimal polynomial of \(T\) can be written as a product of linear factors:
\begin{equation*} p(x)=(x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_m)\text{,} \end{equation*}
which tells us that there is a basis \(\bcal\) in which \(\coords{T}{\bcal}\) is upper triangular. We will denote the vectors in \(\bcal\) as \(\bcal=\{\basis{\vvec}{n}\}\text{.}\)
Since \(V\) is an inner product space, we can apply the Gram-Schmidt algorithm to \(\bcal\) to form a new orthogonal basis \(\ccal\text{.}\) The vectors in \(\ccal\) will be denoted by \(\ccal=\{\basis{\wvec}{n}\}\) so that
\begin{align*} \wvec_1\amp=\vvec_1\\ \wvec_2\amp = \vvec_2 - \frac{\inner{\vvec_2}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1\\ \wvec_3\amp = \vvec_3 - \frac{\inner{\vvec_3}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1 -\frac{\inner{\vvec_3}{\wvec_2}}{\inner{\wvec_2}{\wvec_2}}\wvec_2 \end{align*}
and so forth. We can rearrange these expressions so that
\begin{align*} \vvec_1\amp=\wvec_1\\ \vvec_2\amp = \wvec_2 + \frac{\inner{\vvec_2}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1\\ \vvec_3\amp = \wvec_3 + \frac{\inner{\vvec_3}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1 +\frac{\inner{\vvec_3}{\wvec_2}}{\inner{\wvec_2}{\wvec_2}}\wvec_2\text{.} \end{align*}
In other words, the change of coordinates matrix \(\coords{I}{\ccal,\bcal}\) is upper triangular, which implies that
\begin{equation*} \coords{T}{\ccal} = \coords{I}{\ccal,\bcal} \coords{T}{\bcal} \coords{I}{\bcal,\ccal} \end{equation*}
is upper triangular.
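This is easy to see numerically. The sketch below is in Python with NumPy, using the standard inner product on \(\complex^n\text{;}\) the gram_schmidt helper is our own illustration, not a library routine. It orthogonalizes a random basis and verifies that the resulting change of coordinates matrix is upper triangular.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of vectors with the classical Gram-Schmidt recipe."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        for u in basis:
            # subtract the projection of v onto each earlier basis vector
            w = w - (np.vdot(u, v) / np.vdot(u, u)) * u
        basis.append(w)
    return basis

rng = np.random.default_rng(0)
# columns of V form a random (almost surely linearly independent) basis
V = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
W = np.column_stack(gram_schmidt(list(V.T)))

# V = W C, so the columns of C hold the coordinates of each v_j in the
# basis w_1, ..., w_n; this plays the role of the change of coordinates matrix
C = np.linalg.solve(W, V)
print(np.allclose(C, np.triu(C)))  # True: C is upper triangular
```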
We obtain an orthonormal basis by setting \(\uvec_j=\frac{\wvec_j}{\len{\wvec_j}}\text{.}\) Since the change of coordinates matrix relating \(\ccal\) to this orthonormal basis is diagonal, the matrix of \(T\) remains upper triangular, and we obtain the following result.
This result is sometimes expressed in terms of matrices. We earlier considered orthogonal matrices, which are real matrices whose columns form an orthonormal basis. The complex analogue of an orthogonal matrix is called unitary.

Definition 2.2.2.

A complex \(n\times n\) matrix \(U\) whose columns form an orthonormal basis for \(\complex^n\) is called unitary. Such a matrix satisfies \(U^*U=UU^*=I\text{.}\)
We can now restate the Schur decomposition in terms of unitary matrices.
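SciPy computes this factorization with scipy.linalg.schur. A quick sanity check, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

# complex Schur form: A = U T U^* with T upper triangular and U unitary
T, U = schur(A, output='complex')

print(np.allclose(U.conj().T @ U, np.eye(4)))  # U^* U = I: U is unitary
print(np.allclose(T, np.triu(T)))              # T is upper triangular
print(np.allclose(A, U @ T @ U.conj().T))      # A = U T U^*
```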

Subsection 2.2.2 Self-adjoint operators

When \(V\) and \(W\) are inner product spaces, a linear transformation \(T:V\to W\) has an adjoint \(T^*:W\to V\) as introduced in Subsection 1.3.3. When expressed in terms of orthonormal bases for \(V\) and \(W\text{,}\) the matrix associated to \(T^*\) is the conjugate transpose of the matrix associated to \(T\text{.}\) When the vector spaces are real, the matrices are simply transposes of one another.
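Here is a small numerical illustration of that relationship, a sketch in Python with NumPy using the standard inner product on \(\complex^3\text{;}\) the convention below takes the inner product to be conjugate-linear in its second argument, which is an assumption about notation rather than something fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
v = rng.normal(size=3) + 1j * rng.normal(size=3)
w = rng.normal(size=3) + 1j * rng.normal(size=3)

# standard inner product on C^3, conjugate-linear in the second argument
def inner(x, y):
    return np.vdot(y, x)

# defining property of the adjoint, <Tv, w> = <v, T*w>, where the matrix
# of T* is the conjugate transpose of the matrix of T
print(np.isclose(inner(A @ v, w), inner(v, A.conj().T @ w)))  # True
```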
We will now consider operators \(T:V\to V\) on an inner product space \(V\) that are self-adjoint.

Definition 2.2.4.

We say that an operator \(T\) on an inner product space \(V\) is self-adjoint if \(T=T^*\text{.}\)

Proof.

By the Schur decomposition 2.2.1, we know that there is an orthonormal basis \(\bcal\) for which \(\coords{T}{\bcal} = A\) is upper triangular. However, since \(T=T^*\text{,}\) we also know that \(A=\conj{A}^T\text{.}\) Because \(A\) is upper triangular while \(\conj{A}^T\) is lower triangular, this equality forces \(A\) to be diagonal, and each diagonal entry satisfies \(A_{jj}=\conj{A_{jj}}\text{,}\) so the diagonal entries are real.
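We can watch this collapse happen numerically: for a self-adjoint matrix, the triangular factor produced by scipy.linalg.schur is, up to rounding, diagonal with real entries. A sketch assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = B + B.conj().T                 # A = A*, a self-adjoint matrix

T, U = schur(A, output='complex')  # A = U T U^* with T upper triangular

# for a self-adjoint matrix, the upper triangular factor collapses
print(np.allclose(T, np.diag(np.diag(T))))  # T is diagonal
print(np.allclose(np.diag(T).imag, 0))      # with real diagonal entries
```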
For real inner product spaces, self-adjoint operators are represented by symmetric matrices.

Proof.

By the Fundamental Theorem of Linear Maps 1.2.13, we only need to show that \(\nul(T^2+bT+cI)=\{\zerovec\}\text{.}\) To do so, we suppose that \(\vvec\) is a nonzero vector and consider
\begin{equation*} \begin{aligned} \inner{(T^2+bT+cI)\vvec}{\vvec} \amp = \inner{T^2\vvec}{\vvec}+ \inner{bT\vvec}{\vvec}+c\inner{\vvec}{\vvec} \\ \amp = \inner{T\vvec}{T\vvec}+ b\inner{T\vvec}{\vvec}+c\inner{\vvec}{\vvec}\\ \amp = \len{T\vvec}^2 + b\inner{T\vvec}{\vvec}+c\len{\vvec}^2 \\ \amp \geq \len{T\vvec}^2 - |b|\len{T\vvec}\len{\vvec} + c\len{\vvec}^2 \\ \amp = \left(\len{T\vvec} - \frac{|b|}{2}\len{\vvec}\right)^2 + \left(c-\frac{b^2}{4}\right)\len{\vvec}^2 \\ \amp \gt 0\text{,} \\ \end{aligned} \end{equation*}
where the inequality follows from the Cauchy-Schwarz inequality and the final expression is positive because \(b^2\lt 4c\) and \(\vvec\neq\zerovec\text{.}\) This shows that \((T^2+bT+cI)\vvec\) is also nonzero, so the operator is an isomorphism.
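The lemma is easy to test numerically. In the sketch below, a Python illustration assuming NumPy, the symmetric matrix and the values \(b=1\text{,}\) \(c=2\) are arbitrary choices with \(b^2\lt 4c\text{;}\) the operator \(T^2+bT+cI\) turns out to be positive definite, hence invertible.

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(5, 5))
A = B + B.T                        # a self-adjoint operator on R^5

b, c = 1.0, 2.0                    # b^2 = 1 < 8 = 4c
M = A @ A + b * A + c * np.eye(5)  # the operator T^2 + bT + cI

# the eigenvalues of M are q(lam) = lam^2 + b*lam + c for the (real)
# eigenvalues lam of A, and completing the square gives q(lam) >= c - b^2/4
print(np.linalg.eigvalsh(M).min() > 0)  # True: M is positive definite
```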
If \(\field=\complex\) and \(T\) is an operator on \(V\text{,}\) we know from Theorem 1.4.5 that the minimal polynomial of \(T\) has the form
\begin{equation*} p(x)=(x-\lambda_1)(x-\lambda_2)\ldots(x-\lambda_m) \end{equation*}
where each \(\lambda_j\in\complex\text{.}\) If \(\field=\real\) and \(T\) is a self-adjoint operator on \(V\text{,}\) we can reach a similar conclusion.

Proof.

We know that the minimal polynomial has the form
\begin{equation*} p(x)=(x-\lambda_1)\ldots(x-\lambda_m)(x^2+b_1x+c_1)\ldots (x^2+b_nx+c_n) \end{equation*}
where \(b_i^2 \lt 4c_i\) for each \(i\text{.}\) Since \(p(T)=0\text{,}\) we know that
\begin{equation*} p(T)=(T-\lambda_1 I)\ldots(T-\lambda_m I)(T^2+b_1T+c_1I)\ldots (T^2+b_nT+c_nI)=0\text{.} \end{equation*}
If \(n\gt 0\text{,}\) then \(T^2+b_1T+c_1I\) is invertible by Lemma 2.2.6. Multiplying \(p(T)\) by the inverse of this operator would produce a polynomial \(q\) of smaller degree for which \(q(T)=0\text{.}\)
Since the minimal polynomial \(p\) has the smallest degree among all polynomials for which \(p(T)=0\text{,}\) this is impossible. We conclude that \(n=0\) and therefore
\begin{equation*} p(x)=(x-\lambda_1)\ldots(x-\lambda_m)\text{.} \end{equation*}
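A quick numerical illustration of this conclusion, assuming NumPy: the eigenvalues of a real symmetric matrix are real, consistent with the minimal polynomial splitting into real linear factors.

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.normal(size=(4, 4))
A = B + B.T                       # real symmetric, hence self-adjoint

# every eigenvalue is real, consistent with the minimal polynomial
# factoring into linear terms (x - lambda_j) over the reals
eigs = np.linalg.eigvals(A)       # generic solver, allows complex output
print(np.allclose(eigs.imag, 0))  # True
```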

Proof.

By Theorem 2.1.3 and Proposition 2.2.7, we know that there is a basis \(\bcal\) of \(V\) for which the matrix associated to \(T\) is upper triangular. As before, we apply the Gram-Schmidt algorithm to obtain an orthonormal basis \(\ccal\) and note that the change of coordinates matrix is upper triangular. Therefore,
\begin{equation*} \coords{T}{\ccal} = \coords{I}{\ccal,\bcal}\coords{T}{\bcal} \coords{I}{\bcal,\ccal} \end{equation*}
is also upper triangular.
However, if \(A=\coords{T}{\ccal}\) is this matrix, we know that \(A^T=A\) since \(T=T^*\) is self-adjoint and \(\ccal\) is orthonormal. An upper triangular matrix that equals its own transpose must be diagonal, so \(A\) is diagonal.
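In practice, this orthonormal basis of eigenvectors is what numpy.linalg.eigh computes. A minimal sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.normal(size=(4, 4))
A = B + B.T                       # a real symmetric matrix

# eigh returns real eigenvalues and an orthonormal basis of eigenvectors
lam, Q = np.linalg.eigh(A)

print(np.allclose(Q.T @ Q, np.eye(4)))         # columns are orthonormal
print(np.allclose(A, Q @ np.diag(lam) @ Q.T))  # A = Q D Q^T
```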
In terms of matrices, this has the more familiar form: