In our earlier studies, the dot product helped us develop a richer, geometric perspective on some key ideas. In particular, we could use the dot product to detect when vectors are orthogonal, and this led to important ideas, such as least squares and the singular value decomposition.
As we expand our study to more general vector spaces, we would like to generalize the dot product so that it applies in this broader setting. This leads us to inner products.
Things are a little different when we are using complex numbers. If \(z\) is a general complex number, then \(z^2\) is not guaranteed to be real, much less nonnegative. To preserve the positivity condition above, remember that the complex conjugate is defined by
\begin{equation*}
\conj{a+bi} = a-bi\text{,}
\end{equation*}
so that \(z\conj{z} = |z|^2\) is always real and nonnegative.
With this definition, the three properties above still hold except that the symmetry condition is modified to \(\wvec\cdot\vvec = \conj{\vvec\cdot\wvec}\text{.}\)
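For instance, if we read the complex dot product as multiplying each component of the first vector by the conjugate of the corresponding component of the second and adding, then with the vectors \(\vvec=(1,i)\) and \(\wvec=(2,1+i)\text{,}\) chosen here just for illustration, we find
\begin{equation*}
\vvec\cdot\wvec = 1\cdot\conj{2} + i\cdot\conj{1+i} = 2 + i(1-i) = 3+i,
\qquad
\wvec\cdot\vvec = 2\cdot\conj{1} + (1+i)\cdot\conj{i} = 2 + (1+i)(-i) = 3-i,
\end{equation*}
so that \(\wvec\cdot\vvec = \conj{\vvec\cdot\wvec}\) as the modified symmetry condition predicts.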
In fact, this is true if \(V=\real^n\) as well. If \(x\) is real, then \(\conj{x} = x\) so the conjugate symmetry condition is the same as the symmetry condition above.
This may seem strange when you first see it, but it is just an extension of the usual dot product in some sense. For instance, think of a three-dimensional vector as a function \(\vvec\) from the set \(\{-1,0,1\}\) into \(\real\text{.}\) The dot product between two such vectors is then
\begin{equation*}
\vvec\cdot\wvec = \vvec(-1)\conj{\wvec(-1)} + \vvec(0)\conj{\wvec(0)} + \vvec(1)\conj{\wvec(1)}
\end{equation*}
so that we multiply the values of \(\vvec\) and \(\conj{\wvec}\) at each point and add. If we interpret the integral as an infinite sum, this is what the inner product defined above is doing.
Suppose \(V=\field^{m,n}\text{,}\) the vector space of \(m\times n\) matrices with entries in \(\field\text{.}\) If \(A\) is such a matrix, we define \(A^*\) to be its conjugate transpose. That is, \(A^*=\conj{A^T}\text{.}\) Then
\begin{equation*}
\inner{A}{B} = \operatorname{tr}(AB^*) = \sum_{i,j} A_{ij}\conj{B_{ij}}
\end{equation*}
defines an inner product on \(V\text{.}\)
Since an inner product extends the dot product to more general vector spaces, we have access to many familiar ideas, such as orthogonality.
If \(V=\poly\text{,}\) the set of all polynomials, with the inner product given in Example 1.3.3, then \(p(x)=x-x^3\) is orthogonal to \(q(x)=x^2+7x^8\text{.}\) This follows because each term in \(p(x)\conj{q(x)}\) is an odd power of \(x\) whose integral on the interval \([-1,1]\) will be zero by symmetry.
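With the inner product \(\inner{p}{q}=\int_{-1}^{1}p(x)\conj{q(x)}\,dx\text{,}\) which is how we are reading Example 1.3.3, this computation can be carried out directly:
\begin{equation*}
\inner{p}{q} = \int_{-1}^{1}(x-x^3)(x^2+7x^8)\,dx = \int_{-1}^{1}\left(x^3 + 7x^9 - x^5 - 7x^{11}\right)dx = 0
\end{equation*}
since each term is an odd function integrated over an interval that is symmetric about \(0\text{.}\)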
In an inner product space, we say that \(\basis{\vvec}{m}\) is an orthogonal set if each vector is nonzero and any two distinct vectors in the set are orthogonal to one another.
Suppose that \(\basis{\wvec}{m}\) is an orthogonal set in an inner product space \(V\text{,}\) that \(W\) is the subspace of \(V\) spanned by these vectors, and that \(\bvec\) is a vector in \(V\text{.}\) The closest vector in \(W\) to \(\bvec\) is called the orthogonal projection of \(\bvec\) onto \(W\) and is given by
\begin{equation*}
\bhat = \frac{\inner{\bvec}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1 + \frac{\inner{\bvec}{\wvec_2}}{\inner{\wvec_2}{\wvec_2}}\wvec_2 + \cdots + \frac{\inner{\bvec}{\wvec_m}}{\inner{\wvec_m}{\wvec_m}}\wvec_m\text{.}
\end{equation*}
Notice that, by linearity, if a vector \(\uvec\) is orthogonal to each \(\wvec_j\text{,}\) then it is orthogonal to every vector in \(W\text{.}\) This is because any vector in \(W\) is a linear combination of \(\basis{\wvec}{m}\) so that
\begin{equation*}
\inner{\uvec}{c_1\wvec_1 + \cdots + c_m\wvec_m} = \conj{c_1}\inner{\uvec}{\wvec_1} + \cdots + \conj{c_m}\inner{\uvec}{\wvec_m} = 0\text{.}
\end{equation*}
A direct computation with the expression for \(\bhat\) shows that \(\bvec-\bhat\) is orthogonal to each \(\wvec_j\) and hence, by this observation, to every vector in \(W\text{.}\)
Now suppose that \(\wvec\) is any other vector in \(W\text{.}\) Then \(\bhat - \wvec\) is in \(W\) and hence orthogonal to \(\bvec-\bhat\text{.}\) Therefore,
\begin{equation*}
\|\bvec-\wvec\|^2 = \|(\bvec-\bhat) + (\bhat-\wvec)\|^2 = \|\bvec-\bhat\|^2 + \|\bhat-\wvec\|^2 \geq \|\bvec-\bhat\|^2\text{,}
\end{equation*}
which shows that \(\bhat\) is indeed the vector in \(W\) closest to \(\bvec\text{.}\)
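As a small illustration of our own, still using the inner product \(\inner{p}{q}=\int_{-1}^{1}p(x)\conj{q(x)}\,dx\) on polynomials, the set \(\wvec_1 = 1\text{,}\) \(\wvec_2 = x\) is orthogonal, and the orthogonal projection of \(\bvec = x^2\) onto the subspace \(W\) they span is
\begin{equation*}
\bhat = \frac{\inner{x^2}{1}}{\inner{1}{1}}\,1 + \frac{\inner{x^2}{x}}{\inner{x}{x}}\,x
= \frac{2/3}{2} + \frac{0}{2/3}\,x = \frac{1}{3}\text{.}
\end{equation*}
In other words, the constant polynomial \(\frac13\) is the closest polynomial in \(W\) to \(x^2\) with respect to this inner product.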
Notice that a vector space \(V\) is a subspace of itself, so the previous proposition implies that every finite-dimensional inner product space has an orthonormal basis.
Also, remember that any linearly independent set in \(V\) can be extended to a basis of \(V\) by Proposition 1.1.34. If we begin with an orthonormal set of vectors in \(V\text{,}\) we can extend it to a basis of \(V\) and apply the Gram-Schmidt algorithm to the added basis vectors to obtain an orthonormal basis of \(V\text{.}\) In other words, any orthonormal set in \(V\) can be extended to an orthonormal basis of \(V\text{.}\)
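For reference, one common form of the Gram-Schmidt step in an inner product space, written here in the notation of this section rather than quoted from earlier in the text, produces from linearly independent vectors \(\vvec_1,\vvec_2,\ldots\) the orthogonal vectors
\begin{equation*}
\wvec_1 = \vvec_1, \qquad
\wvec_k = \vvec_k - \frac{\inner{\vvec_k}{\wvec_1}}{\inner{\wvec_1}{\wvec_1}}\wvec_1 - \cdots - \frac{\inner{\vvec_k}{\wvec_{k-1}}}{\inner{\wvec_{k-1}}{\wvec_{k-1}}}\wvec_{k-1}\text{,}
\end{equation*}
which are then normalized by dividing each \(\wvec_k\) by its length.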
Recall that for vectors in \(\real^n\text{,}\) the dot product satisfies \(\vvec\cdot\wvec = \|\vvec\|\,\|\wvec\|\cos\theta\text{,}\) where \(\theta\) is the angle between \(\vvec\) and \(\wvec\text{.}\) Now that we are working with more general inner product spaces, the idea of the angle between two vectors might not have much meaning. However, since \(|\cos\theta| \leq 1\text{,}\) we have
\begin{equation*}
|\inner{\vvec}{\wvec}| \leq \|\vvec\|\,\|\wvec\|\text{,}
\end{equation*}
an inequality, known as the Cauchy-Schwarz inequality, that continues to make sense, and in fact holds, in any inner product space.
We know that \(\what\) and \(\wvec-\what\) are orthogonal by the construction of the orthogonal projection. Therefore, the Pythagorean theorem applies, giving \(\|\wvec\|^2 = \|\what\|^2 + \|\wvec-\what\|^2\text{.}\)
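As a quick sanity check of the inequality, again assuming the inner product \(\inner{p}{q}=\int_{-1}^{1}p(x)\conj{q(x)}\,dx\) and taking \(\vvec=1\) and \(\wvec=x^2\text{,}\) an example of our own choosing, we have
\begin{equation*}
|\inner{1}{x^2}| = \frac{2}{3} \leq \sqrt{\inner{1}{1}}\,\sqrt{\inner{x^2}{x^2}} = \sqrt{2}\cdot\sqrt{\tfrac{2}{5}} = \frac{2}{\sqrt{5}}\text{.}
\end{equation*}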
Subsection 1.3.4 The adjoint of a linear transformation
We suppose now that \(V\) and \(W\) are inner product spaces over a field \(\field\text{.}\) If \(T:V\to W\) is a linear transformation, we can define its adjoint \(T^*\) through the relationship
\begin{equation*}
\inner{T(\vvec)}{\wvec} = \inner{\vvec}{T^*(\wvec)}
\end{equation*}
for all vectors \(\vvec\) in \(V\) and \(\wvec\) in \(W\text{.}\) To see that this relationship actually determines a function \(T^*:W\to V\text{,}\) we first study linear functionals, that is, linear transformations \(\phi:V\to\field\text{,}\) and show that every functional can be written as \(\phi(\vvec)=\inner{\vvec}{\uvec}\) for a unique vector \(\uvec\text{.}\)
If \(\phi = 0\text{,}\) then we can take \(\uvec=0\) as well. This choice of \(\uvec\) is unique because if \(\wvec\) is a nonzero vector with \(\phi(\vvec)=\inner{\vvec}{\wvec} = 0\) for all \(\vvec\text{,}\) then
\begin{equation*}
\inner{\wvec}{\wvec} = \phi(\wvec) = 0\text{,}
\end{equation*}
which is impossible when \(\wvec\) is nonzero.
Suppose now that \(\phi\neq 0\text{,}\) which means that there is a vector \(\vvec\) such that \(\phi(\vvec) \neq 0\text{.}\) Since every scalar in \(\field\) is a multiple of \(\phi(\vvec)\) and \(\phi\) is linear, \(\phi\) is surjective and \(\range(\phi)=\field\text{.}\) In particular, \(\dim\range(\phi) = 1\) so that \(\dim\nul(\phi) = n-1\text{,}\) where \(n=\dim V\text{.}\)
Choose an orthonormal basis \(\basis{\vvec}{n-1}\) for \(\nul(\phi)\text{.}\) We know by Proposition 1.3.16 that we can add a vector \(\wvec\) to obtain an orthonormal basis of \(V\text{.}\) Let \(\uvec=\conj{\phi(\wvec)}\wvec\text{.}\)
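To see why this choice of \(\uvec\) works, here is one way to carry out the check; the coefficients \(c_1,\ldots,c_{n-1},c\) are introduced only for this computation. Writing an arbitrary vector as \(\vvec = c_1\vvec_1+\cdots+c_{n-1}\vvec_{n-1}+c\wvec\text{,}\) we have
\begin{equation*}
\phi(\vvec) = c\,\phi(\wvec),
\qquad
\inner{\vvec}{\uvec} = \inner{\vvec}{\conj{\phi(\wvec)}\wvec} = \phi(\wvec)\inner{\vvec}{\wvec} = \phi(\wvec)\,c\text{,}
\end{equation*}
since each \(\vvec_j\) lies in \(\nul(\phi)\) and the basis is orthonormal. Therefore \(\phi(\vvec)=\inner{\vvec}{\uvec}\) for every \(\vvec\text{.}\)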
To see that \(\uvec\) is unique, suppose that \(\uvec_1\) and \(\uvec_2\) are two vectors with \(\phi(\vvec) = \inner{\vvec}{\uvec_1} = \inner{\vvec}{\uvec_2}\) for every vector \(\vvec\text{.}\) In particular, we have \(\inner{\vvec}{\uvec_1-\uvec_2} = 0\) for every \(\vvec\text{,}\) including \(\vvec=\uvec_1-\uvec_2\text{.}\) Therefore, \(\inner{\uvec_1-\uvec_2}{\uvec_1-\uvec_2} = 0\text{,}\) which means that \(\uvec_1-\uvec_2=0\) and hence \(\uvec_1=\uvec_2\text{.}\)
If \(\basis{\uvec}{n}\) is an orthonormal basis for \(V\) and \(\phi:V\to\field\) is a functional on \(V\text{,}\) then the vector \(\uvec\) given by the Riesz Representation Theorem 1.3.18 is
\begin{equation*}
\uvec = \conj{\phi(\uvec_1)}\,\uvec_1 + \conj{\phi(\uvec_2)}\,\uvec_2 + \cdots + \conj{\phi(\uvec_n)}\,\uvec_n\text{.}
\end{equation*}
Suppose that \(\phi:V\to\field\) is a functional and that \(\uvec = c_1\uvec_1+\ldots+c_n\uvec_n\) is the vector given in the Riesz Representation Theorem. Notice that
\begin{equation*}
\phi(\uvec_j) = \inner{\uvec_j}{\uvec} = \inner{\uvec_j}{c_1\uvec_1+\cdots+c_n\uvec_n} = \conj{c_j}\text{,}
\end{equation*}
so that \(c_j = \conj{\phi(\uvec_j)}\) for each \(j\text{.}\)
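As a small illustration with data of our own choosing, take \(V=\real^2\) with the dot product, the standard orthonormal basis \(\uvec_1,\uvec_2\text{,}\) and the functional \(\phi(x,y)=3x-2y\text{.}\) Then
\begin{equation*}
\uvec = \conj{\phi(\uvec_1)}\,\uvec_1 + \conj{\phi(\uvec_2)}\,\uvec_2 = 3\uvec_1 - 2\uvec_2\text{,}
\end{equation*}
and indeed \(\phi(\vvec) = \vvec\cdot\uvec\) for every \(\vvec\) in \(\real^2\text{.}\)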
There are a number of things implied by this definition so we need to check that they are satisfied. The following proposition will take care of this for us.
We first need to establish that \(T^*(\wvec)\) is a well-defined vector in \(V\) for every \(\wvec\) in \(W\text{.}\) For a fixed \(\wvec\) in \(W\text{,}\) define the linear transformation \(\phi:V\to \field\) by
\begin{equation*}
\phi(\vvec) = \inner{T(\vvec)}{\wvec}\text{.}
\end{equation*}
By the Riesz Representation Theorem 1.3.18, we know there is a vector \(\uvec\) in \(V\) such that \(\phi(\vvec) = \inner{\vvec}{\uvec}\text{.}\) We define \(T^*(\wvec) = \uvec\text{,}\) which gives
\begin{equation*}
\inner{T(\vvec)}{\wvec} = \phi(\vvec) = \inner{\vvec}{\uvec} = \inner{\vvec}{T^*(\wvec)}\text{.}
\end{equation*}
We have now defined a function \(T^*:W\to V\) such that \(\inner{T(\vvec)}{\wvec} = \inner{\vvec}{T^*(\wvec)}\) for all \(\vvec\) and \(\wvec\text{.}\)
Finally, we need to show that \(T^*\) is a linear transformation by verifying that \(T^*\) satisfies the two linearity properties. Suppose that \(\wvec_1\) and \(\wvec_2\) are vectors in \(W\text{.}\) Then, for every \(\vvec\) in \(V\text{,}\)
\begin{equation*}
\inner{\vvec}{T^*(\wvec_1+\wvec_2)} = \inner{T(\vvec)}{\wvec_1+\wvec_2} = \inner{T(\vvec)}{\wvec_1} + \inner{T(\vvec)}{\wvec_2} = \inner{\vvec}{T^*(\wvec_1)+T^*(\wvec_2)}\text{,}
\end{equation*}
which implies that \(T^*(\wvec_1+\wvec_2) = T^*(\wvec_1)+T^*(\wvec_2)\text{.}\)
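One way to carry out the check for scalar multiplication is similar: for a scalar \(c\) in \(\field\) and every \(\vvec\) in \(V\text{,}\)
\begin{equation*}
\inner{\vvec}{T^*(c\wvec)} = \inner{T(\vvec)}{c\wvec} = \conj{c}\,\inner{T(\vvec)}{\wvec} = \conj{c}\,\inner{\vvec}{T^*(\wvec)} = \inner{\vvec}{c\,T^*(\wvec)}\text{,}
\end{equation*}
so that \(T^*(c\wvec) = c\,T^*(\wvec)\text{.}\)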
We now relate the matrices associated to \(T\) and \(T^*\) with respect to an orthonormal basis of \(V\text{.}\) As before, we use \(\uvec_1,\ldots,\uvec_n\) to denote an orthonormal basis of \(V\text{.}\)
Suppose that \(V\) and \(W\) are inner product spaces with orthonormal bases \(\bcal\) and \(\ccal\text{,}\) respectively. If \(T:V\to W\) is a linear transformation, \(A=\coords{T}{\ccal,\bcal}\text{,}\) and \(B=\coords{T^*}{\bcal,\ccal}\text{,}\) then \(B = \conj{A^T} = A^*\text{.}\)
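Here is a sketch of why this holds, under the assumption that the entries of a coordinate matrix with respect to orthonormal bases are computed from inner products; that is, writing \(\uvec_1,\ldots,\uvec_n\) for the vectors in \(\bcal\) and \(\wvec_1,\ldots,\wvec_m\) for the vectors in \(\ccal\text{,}\) we take \(A_{ij} = \inner{T(\uvec_j)}{\wvec_i}\) and \(B_{ij} = \inner{T^*(\wvec_j)}{\uvec_i}\text{.}\) Then conjugate symmetry and the defining property of the adjoint give
\begin{equation*}
B_{ij} = \inner{T^*(\wvec_j)}{\uvec_i} = \conj{\inner{\uvec_i}{T^*(\wvec_j)}} = \conj{\inner{T(\uvec_i)}{\wvec_j}} = \conj{A_{ji}}\text{,}
\end{equation*}
which is exactly the statement that \(B = \conj{A^T} = A^*\text{.}\)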
If the underlying field is \(\field=\real\text{,}\) then the matrix associated to the adjoint \(T^*\) is just the transpose of the matrix associated to \(T\text{.}\) In other words, \(B=A^T\) in the notation of Proposition 1.3.22.