In our earlier studies, we introduced the dot product to provide a richer geometric perspective on some key ideas. In particular, we could use the dot product to detect when vectors are orthogonal, and this led to many simplifications. For instance, the inverse of a matrix whose columns form an orthonormal basis of \(\real^n\) is just the transpose of that matrix.
As we expand our study to more general vector spaces, we need a concept that plays the role of the dot product in this broader setting. This leads us to the concept of an inner product.
Subsection 1.3.1 Inner products
On the vector space \(\real^n\text{,}\) we introduced the dot product between two vectors:
\begin{equation*}
\vvec\cdot\wvec = v_1w_1 + v_2w_2 + \cdots + v_nw_n\text{.}
\end{equation*}
The dot product is linear in each argument, symmetric in the sense that \(\vvec\cdot\wvec = \wvec\cdot\vvec\text{,}\) and positive in the sense that \(\vvec\cdot\vvec \geq 0\) with \(\vvec\cdot\vvec = 0\) only when \(\vvec = 0\text{.}\)
Things are a little different when we are using complex numbers. If \(z\) is a general complex number, then \(z^2\) is not guaranteed to be real, much less nonnegative. To preserve the positivity condition above, remember that the complex conjugate is defined by \(\conj{a+bi} = a-bi\text{,}\) and define the dot product on \(\complex^n\) by
\begin{equation*}
\vvec\cdot\wvec = v_1\conj{w_1} + v_2\conj{w_2} + \cdots + v_n\conj{w_n}\text{.}
\end{equation*}
With this definition, the three properties above still hold except that the symmetry condition is modified to \(\vvec\cdot\wvec =
\conj{\wvec\cdot\vvec}\text{.}\)
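For instance, if \(z = a+bi\text{,}\) then multiplying by the conjugate produces a nonnegative real number,
\begin{equation*}
z\conj{z} = (a+bi)(a-bi) = a^2+b^2 \geq 0\text{,}
\end{equation*}
with equality exactly when \(z=0\text{,}\) which is why pairing each component with a conjugate preserves positivity.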
Definition 1.3.1.
If \(V\) is a vector space over \(\field\text{,}\) we call \(\inner{}{}\) an inner product provided that
Linearity.
\(\inner{c\vvec_1+\vvec_2}{\wvec} = c\inner{\vvec_1}{\wvec} + \inner{\vvec_2}{\wvec}\) for all vectors \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\wvec\) and all scalars \(c\text{.}\)
Conjugate symmetry.
\(\inner{\vvec}{\wvec} = \conj{\inner{\wvec}{\vvec}}\) for all vectors \(\vvec\) and \(\wvec\text{.}\)
Positivity.
\(\inner{\vvec}{\vvec}\geq 0\) and \(\inner{\vvec}{\vvec} = 0\) if and only if \(\vvec=0\text{.}\)
If \(V=\complex^n\text{,}\) then \(\inner{\vvec}{\wvec} =
\vvec\cdot\wvec\) is an inner product.
In fact, this is true if \(V=\real^n\) as well. If \(x\) is real, then \(\conj{x} = x\) so the conjugate symmetry condition is the same as the symmetry condition above.
Example 1.3.3.
If \(\poly\) is the vector space of all polynomials over \(\field\text{,}\) then
\begin{equation*}
\inner{p}{q} = \int_{-1}^1 p(x)\conj{q(x)}\,dx
\end{equation*}
defines an inner product on \(\poly\text{.}\)
This may seem strange when you first see it, but it is just an extension of the usual dot product in some sense. For instance, think of a three-dimensional vector as a function from the set \(\{-1,0,1\}\) into \(\real\text{.}\) The dot product between two vectors is then
\begin{equation*}
\vvec\cdot\wvec = v(-1)\conj{w(-1)} + v(0)\conj{w(0)} + v(1)\conj{w(1)}
\end{equation*}
so that we multiply the values of \(\vvec\) and \(\conj{\wvec}\) at each point and add. If we interpret the integral as an infinite sum, this is what the inner product defined above is doing.
Example 1.3.4.
Suppose \(V=\field^{m,n}\text{,}\) the vector space of \(m\times n\) matrices. If \(A\) is such a matrix, we define \(A^*\) to be its conjugate transpose. That is, \(A^*=\conj{A^T}\text{.}\) Then
\begin{equation*}
\inner{A}{B} = \operatorname{tr}(AB^*)\text{,}
\end{equation*}
the trace of the matrix \(AB^*\text{,}\) defines an inner product on \(V\text{.}\)
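For instance, the conjugate transpose of a \(2\times 2\) complex matrix is found by transposing the matrix and conjugating each entry:
\begin{equation*}
A = \begin{bmatrix} 1 & 2+i \\ 3i & 4 \end{bmatrix}, \qquad
A^* = \conj{A^T} = \begin{bmatrix} 1 & -3i \\ 2-i & 4 \end{bmatrix}\text{.}
\end{equation*}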
If \(T:V\to W\) is a linear transformation between inner product spaces such that \(\inner{T(\vvec_1)}{T(\vvec_2)} = \inner{\vvec_1}{\vvec_2}\) for all vectors \(\vvec_1\) and \(\vvec_2\text{,}\) we say that \(T\) is an isometry of vector spaces.
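In particular, taking \(\vvec_1=\vvec_2=\vvec\) shows that an isometry preserves lengths, where \(\len{\vvec}=\sqrt{\inner{\vvec}{\vvec}}\) denotes the length of a vector:
\begin{equation*}
\len{T(\vvec)}^2 = \inner{T(\vvec)}{T(\vvec)} = \inner{\vvec}{\vvec} = \len{\vvec}^2\text{.}
\end{equation*}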
Subsection 1.3.2 Orthogonality
Since an inner product extends the dot product to more general vector spaces, we have access to many similar concepts, such as orthogonality.
Definition 1.3.8.
Two vectors \(\vvec\) and \(\wvec\) in an inner product space are orthogonal if \(\inner{\vvec}{\wvec} =
0\text{.}\)
Example 1.3.9.
If \(V=\poly\text{,}\) the set of all polynomials, with the inner product given in Example 1.3.3, then \(p(x)=x-x^3\) is orthogonal to \(q(x)=x^2+7x^8\text{.}\) This follows because each term in \(p(x)\conj{q(x)}\) is an odd power of \(x\) whose integral on the interval \([-1,1]\) will be zero by symmetry.
More generally, any polynomial whose terms are all of odd degree is orthogonal to any polynomial whose terms are all of even degree.
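For instance, using the inner product of Example 1.3.3,
\begin{equation*}
\inner{x}{x^2} = \int_{-1}^1 x\cdot x^2\,dx = \int_{-1}^1 x^3\,dx = 0
\end{equation*}
since \(x^3\) is an odd function.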
Proposition 1.3.10. Pythagorean theorem.
If \(\vvec\) and \(\wvec\) are two orthogonal vectors in an inner product space, then
\begin{equation*}
\len{\vvec+\wvec}^2 = \len{\vvec}^2 + \len{\wvec}^2\text{.}
\end{equation*}
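To see why, expand \(\len{\vvec+\wvec}^2\) using the inner product and note that orthogonality makes both \(\inner{\vvec}{\wvec}\) and \(\inner{\wvec}{\vvec}\) zero:
\begin{equation*}
\len{\vvec+\wvec}^2 = \inner{\vvec+\wvec}{\vvec+\wvec} = \inner{\vvec}{\vvec} + \inner{\vvec}{\wvec} + \inner{\wvec}{\vvec} + \inner{\wvec}{\wvec} = \len{\vvec}^2 + \len{\wvec}^2\text{.}
\end{equation*}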
In an inner product space, we say that \(\basis{\vvec}{m}\) is an orthogonal set if each vector is nonzero and each pair of vectors is orthogonal to one another.
Proposition 1.3.12.
In an inner product space, an orthogonal set is linearly independent.
Proof.
Suppose that \(\basis{\vvec}{m}\) is an orthogonal set and that
\begin{equation*}
c_1\vvec_1 + c_2\vvec_2 + \cdots + c_m\vvec_m = 0\text{.}
\end{equation*}
Taking the inner product of both sides with \(\vvec_j\) leaves only \(c_j\inner{\vvec_j}{\vvec_j} = 0\) since the other terms vanish by orthogonality. Because \(\vvec_j \neq 0\text{,}\) we know that \(\inner{\vvec_j}{\vvec_j} \neq 0\text{,}\) and so \(c_j = 0\) for every \(j\text{.}\)
From this, we conclude that an orthogonal set forms a basis for the subspace of the inner product space that it spans.
Proposition 1.3.13. Projection formula.
Suppose that \(\basis{\wvec}{m}\) is an orthogonal set in an inner product space \(V\text{,}\) that \(W=\laspan{\basis{\wvec}{m}}\text{,}\) and that \(\bvec\) is a vector in \(V\text{.}\) The closest vector in \(W\) to \(\bvec\) is called the orthogonal projection of \(\bvec\) onto \(W\) and is given by
\begin{equation*}
\bhat = \frac{\inner{\bvec}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1 + \frac{\inner{\bvec}{\wvec_2}}{\inner{\wvec_2}{\wvec_2}}\wvec_2 + \cdots + \frac{\inner{\bvec}{\wvec_m}}{\inner{\wvec_m}{\wvec_m}}\wvec_m\text{.}
\end{equation*}
This generalizes the projection formula that we frequently used in our previous classes, and it is found by the same argument.
We first find the vector \(\bhat\) so that \(\bvec-\bhat\) is orthogonal to \(W\) and then explain why it is the closest vector.
Notice that, by linearity, if a vector \(\uvec\) is orthogonal to each \(\wvec_j\text{,}\) then it is orthogonal to every vector in \(W\text{.}\) This is because any vector in \(W\) is a linear combination of \(\basis{\wvec}{m}\) so that
\begin{equation*}
\inner{c_1\wvec_1 + c_2\wvec_2 + \cdots + c_m\wvec_m}{\uvec} = c_1\inner{\wvec_1}{\uvec} + c_2\inner{\wvec_2}{\uvec} + \cdots + c_m\inner{\wvec_m}{\uvec} = 0\text{.}
\end{equation*}
Requiring that \(\inner{\bvec-\bhat}{\wvec_j} = 0\) for each \(j\) determines the coefficient of \(\wvec_j\text{,}\) which gives the expression for \(\bhat\) in the statement of the proposition.
Now suppose that \(\wvec\) is any other vector in \(W\text{.}\) Then \(\bhat - \wvec\) is in \(W\) and hence orthogonal to \(\bvec-\bhat\text{.}\) Therefore, the Pythagorean theorem says that
\begin{equation*}
\len{\bvec-\wvec}^2 = \len{(\bvec-\bhat) + (\bhat-\wvec)}^2 = \len{\bvec-\bhat}^2 + \len{\bhat-\wvec}^2 \geq \len{\bvec-\bhat}^2\text{,}
\end{equation*}
which shows that \(\bhat\) is the vector in \(W\) closest to \(\bvec\text{.}\)
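For example, the polynomials \(1\) and \(x\) form an orthogonal set with respect to the inner product of Example 1.3.3 since \(\inner{1}{x} = \int_{-1}^1 x\,dx = 0\text{,}\) and the orthogonal projection of \(\bvec = x^2\) onto \(W=\laspan{1,x}\) is
\begin{equation*}
\bhat = \frac{\inner{x^2}{1}}{\inner{1}{1}}\,1 + \frac{\inner{x^2}{x}}{\inner{x}{x}}\,x = \frac{2/3}{2}\,1 + \frac{0}{2/3}\,x = \frac{1}{3}\text{.}
\end{equation*}
Among all polynomials in \(W\text{,}\) the constant \(\frac13\) is therefore the one closest to \(x^2\) with respect to this inner product.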
Beginning with a basis \(\basis{\vvec}{m}\) for \(W\text{,}\) we subtract from each \(\vvec_j\) its orthogonal projection onto the span of the previously constructed vectors. That is, we set \(\wvec_1 = \vvec_1\text{,}\)
\begin{equation*}
\wvec_2 = \vvec_2 - \frac{\inner{\vvec_2}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1\text{,}\qquad
\wvec_3 = \vvec_3 - \frac{\inner{\vvec_3}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1 - \frac{\inner{\vvec_3}{\wvec_2}}{\inner{\wvec_2}{\wvec_2}}\wvec_2\text{,}
\end{equation*}
and so on. This produces an orthogonal basis for \(W\) since, at every step, \(\laspan{\basis{\wvec}{j}} = \laspan{\basis{\vvec}{j}}\text{.}\)
Finally, we define \(\uvec_j =
\frac{\wvec_j}{\len{\wvec_j}}\) to obtain an orthonormal basis for \(W\text{.}\)
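To illustrate, applying this process to the basis \(1\text{,}\) \(x\text{,}\) \(x^2\) of the polynomials of degree at most two, using the inner product of Example 1.3.3, gives
\begin{equation*}
\wvec_1 = 1, \qquad
\wvec_2 = x - \frac{\inner{x}{1}}{\inner{1}{1}}\,1 = x, \qquad
\wvec_3 = x^2 - \frac{\inner{x^2}{1}}{\inner{1}{1}}\,1 - \frac{\inner{x^2}{x}}{\inner{x}{x}}\,x = x^2 - \frac{1}{3}\text{,}
\end{equation*}
an orthogonal basis that becomes orthonormal after dividing each polynomial by its length.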
Notice that a vector space \(V\) is a subspace of itself, so the previous proposition implies that every finite dimensional inner product space has an orthonormal basis.
Also, remember that any linearly independent set in \(V\) can be extended to a basis of \(V\) by Proposition 1.1.31. If we begin with an orthonormal set of vectors in \(V\text{,}\) we can extend it to a basis of \(V\text{,}\) and apply the Gram-Schmidt algorithm to the added basis vectors to obtain an orthonormal basis of \(V\text{.}\) In other words,
Proposition 1.3.16.
Any orthonormal set in \(V\) can be extended to an orthonormal basis for \(V\text{.}\)
Subsection 1.3.3 The adjoint of a linear transformation
We suppose now that \(V\) and \(W\) are inner product spaces over a field \(\field\text{.}\) If \(T:V\to W\) is a linear transformation, we can define its adjoint \(T^*\) through the following relationship:
\begin{equation*}
\inner{T(\vvec)}{\wvec} = \inner{\vvec}{T^*(\wvec)}
\end{equation*}
for all vectors \(\vvec\) in \(V\) and \(\wvec\) in \(W\text{.}\)
To define \(T^*\text{,}\) we first need the following fact.
Proposition 1.3.17.
If \(V\) is a finite dimensional inner product space and \(\phi:V\to\field\) is a linear transformation, then there is a unique vector \(\uvec\) in \(V\) such that \(\phi(\vvec) = \inner{\vvec}{\uvec}\) for every vector \(\vvec\text{.}\)
Proof.
If \(\phi = 0\text{,}\) then we can take \(\uvec=0\) as well.
So suppose that \(\phi\neq 0\text{,}\) which means that there is a vector \(\vvec\) such that \(\phi(\vvec) \neq 0\text{.}\) Since any scalar \(c\) can be written as \(c = \phi\!\left(\frac{c}{\phi(\vvec)}\,\vvec\right)\text{,}\) it follows that \(\phi\) is onto and \(\range(\phi)=\field\text{.}\)
Because \(\range(\phi)=\field\text{,}\) the null space \(\nul(\phi)\) has dimension \(n-1\text{,}\) where \(n=\dim V\text{.}\) Choose an orthonormal basis \(\basis{\vvec}{n-1}\) for \(\nul(\phi)\text{.}\) We know by Proposition 1.3.16 that we can add a vector \(\wvec\) to obtain an orthonormal basis of \(V\text{.}\) Let \(\uvec=\conj{\phi(\wvec)}\wvec\text{.}\) Writing any vector \(\vvec\) in terms of this orthonormal basis then shows that \(\phi(\vvec) = \inner{\vvec}{\uvec}\text{.}\)
To see that \(\uvec\) is unique, suppose that \(\uvec_1\) and \(\uvec_2\) are two vectors for which \(\phi(\vvec) = \inner{\vvec}{\uvec_1} = \inner{\vvec}{\uvec_2}\) for every vector \(\vvec\text{.}\) In particular, we have \(\inner{\vvec}{\uvec_1-\uvec_2} = 0\) for every \(\vvec\) including \(\vvec=\uvec_1-\uvec_2\text{.}\) Therefore, \(\inner{\uvec_1-\uvec_2}{\uvec_1-\uvec_2} = 0\text{,}\) which means that \(\uvec_1-\uvec_2=0\) and hence \(\uvec_1=\uvec_2\text{.}\)
There are a number of things implied by this definition so we need to check that they are satisfied. The following proposition will take care of this for us.
Proposition 1.3.19.
The adjoint \(T^*:W\to V\) is a linear transformation.
Proof.
We first need to establish that \(T^*(\wvec)\) is a vector in \(V\) for every \(\wvec\) in \(W\text{.}\) For a fixed \(\wvec\) in \(W\text{,}\) define the linear transformation \(\phi:V\to \field\) by
\begin{equation*}
\phi(\vvec) = \inner{T(\vvec)}{\wvec}\text{.}
\end{equation*}
By Proposition 1.3.17, we know there is a vector \(\uvec\) in \(V\) such that \(\phi(\vvec) = \inner{\vvec}{\uvec}\text{,}\) so we define \(T^*(\wvec) = \uvec\text{,}\) which gives
\begin{equation*}
\inner{T(\vvec)}{\wvec} = \phi(\vvec) = \inner{\vvec}{\uvec} = \inner{\vvec}{T^*(\wvec)}\text{.}
\end{equation*}
We have now defined a function \(T^*:W\to V\) such that \(\inner{T(\vvec)}{\wvec} = \inner{\vvec}{T^*(\wvec)}\) for all \(\vvec\) and \(\wvec\text{.}\) We just need to show that \(T^*\) is a linear transformation.
We need to show that \(T^*\) satisfies the two linearity properties. Suppose that \(\wvec_1\) and \(\wvec_2\) are vectors in \(W\text{.}\) Then, for every vector \(\vvec\) in \(V\text{,}\)
\begin{equation*}
\inner{\vvec}{T^*(\wvec_1+\wvec_2)} = \inner{T(\vvec)}{\wvec_1+\wvec_2} = \inner{T(\vvec)}{\wvec_1} + \inner{T(\vvec)}{\wvec_2} = \inner{\vvec}{T^*(\wvec_1)+T^*(\wvec_2)}\text{,}
\end{equation*}
and the uniqueness statement in Proposition 1.3.17 then shows that \(T^*(\wvec_1+\wvec_2) = T^*(\wvec_1)+T^*(\wvec_2)\text{.}\)
In the same way, we see that \(T^*(s\wvec) = sT^*(\wvec)\text{,}\) which verifies that \(T^*:W\to V\) is a linear transformation.
We now relate the matrices associated to \(T\) and \(T^*\) with respect to orthonormal bases. As before, we use \(\uvec_1,\ldots,\uvec_n\) to denote an orthonormal basis of \(V\text{.}\)
Proposition 1.3.20.
Suppose that \(V\) and \(W\) are inner product spaces with orthonormal bases \(\bcal\) and \(\ccal\text{,}\) respectively. If \(T:V\to W\) is a linear transformation, \(A=\coords{T}{\ccal,\bcal}\text{,}\) and \(B=\coords{T^*}{\bcal,\ccal}\text{,}\) then \(B = A^*\text{,}\) the conjugate transpose of \(A\text{.}\)
If the underlying field \(\field=\real\text{,}\) then the matrix associated to the adjoint \(T^*\) is just the transpose of the matrix associated to \(T\text{.}\) In other words, \(B=A^T\) in the notation of Proposition 1.3.20.
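This is consistent with what we know about matrix transformations. Writing the dot product on \(\complex^n\) in the form \(\vvec\cdot\wvec = \wvec^*\vvec\text{,}\) where \(\wvec^*\) is the conjugate transpose of the column vector \(\wvec\text{,}\) the transformation \(T(\vvec) = A\vvec\) satisfies
\begin{equation*}
(A\vvec)\cdot\wvec = \wvec^*(A\vvec) = (A^*\wvec)^*\vvec = \vvec\cdot(A^*\wvec)\text{,}
\end{equation*}
so the adjoint of multiplication by \(A\) is multiplication by the conjugate transpose \(A^*\text{.}\)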