Subsection 1.2.1 Linear transformations
Given two vector spaces, \(V\) and \(W\text{,}\) we can define a linear transformation between them by generalizing our earlier notion of matrix transformation.
Definition 1.2.1.
If \(V\) and \(W\) are vector spaces, then a linear transformation is a function \(T:V\to W\) such that, for every scalar \(s\) and vectors \(\vvec,\vvec_1,\vvec_2\in V\text{,}\) we have
\begin{align*}
T(s\vvec)\amp=sT(\vvec)\\
T(\vvec_1+\vvec_2)\amp =T(\vvec_1)+T(\vvec_2)\text{.}
\end{align*}
Example 1.2.2.
Suppose that \(A=\begin{bmatrix}
2 \amp 0 \\
-1 \amp 2 \\
2 \amp 1 \\
\end{bmatrix}
\text{.}\) Then \(T:\real^2\to\real^3\) defined by \(T(\vvec)=A\vvec\) is a linear transformation.
This follows because matrix multiplication is a linear operation:
\begin{align*}
T(s\vvec) \amp = A(s\vvec) = sA\vvec = sT(\vvec)\\
T(\vvec_1+\vvec_2) \amp = A(\vvec_1+\vvec_2) = A\vvec_1 +
A\vvec_2 = T(\vvec_1) +T(\vvec_2)
\end{align*}
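As a quick computational check, here is a minimal sketch in Python (assuming NumPy is available; the vectors and scalar are arbitrary choices) that tests both defining properties for this particular matrix. A few numerical examples are not a proof, of course; the algebraic argument above is what establishes linearity.
\begin{verbatim}
import numpy as np

A = np.array([[2, 0],
              [-1, 2],
              [2, 1]])
v1 = np.array([1, 3])
v2 = np.array([-2, 5])
s = 7.0

# the two defining properties of a linear transformation
assert np.allclose(A @ (s * v1), s * (A @ v1))
assert np.allclose(A @ (v1 + v2), A @ v1 + A @ v2)
\end{verbatim}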
Example 1.2.3.
Suppose that \(V=\fcal\text{,}\) the set of functions \(f:\real\to\complex\text{.}\) Then \(T(f)=\threevec{f(-3)}{f(0)}{f(3)}\) is a linear transformation \(T:\fcal\to \complex^3\text{.}\)
To see this, we need to remember how scalar multiplication and vector addition work in \(V\text{.}\) If \(s\) is a scalar and \(f_1\) and \(f_2\) are functions, then
\begin{align*}
(sf)(x) \amp = s(f(x))\\
(f_1+f_2)(x) \amp = f_1(x) + f_2(x)
\end{align*}
Therefore,
\begin{align*}
T(sf) \amp = \threevec{(sf)(-3)}{(sf)(0)}{(sf)(3)}
= \threevec{sf(-3)}{sf(0)}{sf(3)}
= s\threevec{f(-3)}{f(0)}{f(3)} = sT(f)\\
T(f_1+f_2) \amp = \threevec{(f_1+f_2)(-3)}
{(f_1+f_2)(0)}{(f_1+f_2)(3)}
= \threevec{f_1(-3)+f_2(-3)}{f_1(0)+f_2(0)}{f_1(3)+f_2(3)}\\
\amp
=
\threevec{f_1(-3)}{f_1(0)}{f_1(3)}+\threevec{f_2(-3)}{f_2(0)}
{f_2(3)}=T(f_1)+T(f_2)
\end{align*}
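We can experiment with this transformation numerically as well. Here is a sketch in Python (assuming NumPy; the functions \(f_1\text{,}\) \(f_2\) and the scalar \(s\) are arbitrary choices) that samples a function at \(-3\text{,}\) \(0\text{,}\) and \(3\) and tests both properties:
\begin{verbatim}
import numpy as np

def T(f):
    # evaluate f at the three sample points, giving a vector in C^3
    return np.array([f(-3), f(0), f(3)], dtype=complex)

f1 = lambda x: x**2 + 1j * x
f2 = lambda x: np.exp(x)
s = 2 - 3j

assert np.allclose(T(lambda x: s * f1(x)), s * T(f1))
assert np.allclose(T(lambda x: f1(x) + f2(x)), T(f1) + T(f2))
\end{verbatim}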
Example 1.2.4.
Suppose that \(V=\pbb_3\) and \(W=\pbb_2\text{.}\) Define \(T:\pbb_3\to\pbb_2\) by \(T(p) = p'\text{,}\) where \(p'\) is the derivative of \(p\text{.}\) Two common rules of differentiation, the constant multiplier rule and the addition rule, imply that \(T\) is a linear transformation.
Example 1.2.5.
If \(V=\pbb_3\) and \(W=\real\text{,}\) then \(T(p)=\int_0^5 p(x)~dx\) is a linear transformation.
Example 1.2.6.
Suppose that \(V\) is the vector space of \(3\times3\) matrices. Then \(T:V\to\real\) defined by \(T(A)=\det(A)\) is not a linear transformation because \(T(sA)=s^3T(A)\text{.}\)
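A quick numerical illustration of this failure, sketched in Python with NumPy and an arbitrary \(3\times3\) matrix:
\begin{verbatim}
import numpy as np

A = np.array([[1.0, 2, 0],
              [0, 1, 3],
              [4, 0, 1]])
s = 2.0

# det(sA) = s^3 det(A), not s det(A), so T(A) = det(A) is not linear
assert np.isclose(np.linalg.det(s * A), s**3 * np.linalg.det(A))
print(np.linalg.det(s * A), s * np.linalg.det(A))  # ~200.0 vs ~50.0
\end{verbatim}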
Definition 1.2.7.
Given two vector spaces, \(V\) and \(W\text{,}\) the set of linear transformations \(T:V\to W\) will be denoted as \(L(V,W)\text{.}\)
Notation.
While a linear transformation \(T:V\to W\) is a function, we will frequently write \(T\vvec\text{,}\) without parentheses, when we mean \(T(\vvec)\text{.}\) This is similar to how we often write \(\sin x\) rather than \(\sin(x)\) in other courses.
Subsection 1.2.2 The null space and range
A linear transformation \(T:V\to W\) naturally determines two subspaces, one of \(V\) and one of \(W\text{.}\)
Definition 1.2.8.
If \(T:V\to W\) is a linear transformation, we define the null space and range of \(T\) to be
\begin{align*}
\nul(T)\amp = \{\vvec~|~T\vvec=0\}\subset V\\
\range(T)\amp = \{\wvec~|~\wvec=T\vvec \text{ for some
}\vvec\in V\}\subset W\text{.}
\end{align*}
In our earlier linear algebra courses, we considered the null space and column space of a matrix. The null space and range of a linear transformation are the same concepts generalized to vector spaces.
Example 1.2.9.
Suppose that \(A\) is a \(10\times 17\) matrix and consider the linear transformation \(T:\real^{17}\to\real^{10}\) defined by \(T(\xvec)=A\xvec\text{.}\) Then \(\nul(A)\) is the set of solutions to the equation \(A\xvec=T(\xvec)=\zerovec\text{,}\) which is the same as the null space \(\nul(T)\text{.}\)
Similarly, the column space is the set of vectors \(\bvec\) for which \(A\xvec = \bvec\) is consistent. In other words, \(\bvec\) is in \(\col(A)\) if and only if there is a vector \(\xvec\) such that \(T(\xvec)=A\xvec
= \bvec\text{.}\) This is precisely the definition of \(\range(T)\text{.}\)
Example 1.2.10.
Consider the linear transformation \(T:\real^4\to\real^3\) defined by the matrix
\begin{equation*}
A=\begin{bmatrix}
2 \amp -1 \amp 2 \amp 3 \\
1 \amp 0 \amp 0 \amp 2 \\
-2 \amp 2 \amp -4 \amp -2 \\
\end{bmatrix}
\sim
\begin{bmatrix}
1 \amp 0 \amp 0 \amp 2 \\
0 \amp 1 \amp -2 \amp 1 \\
0 \amp 0 \amp 0 \amp 0 \\
\end{bmatrix}\text{.}
\end{equation*}
The null space is the set of vectors for which \(T\vvec=A\vvec=\zerovec\text{,}\) which we see is the subspace of \(\real^4\) spanned by \(\fourvec0210\) and \(\fourvec{-2}{-1}01\text{.}\)
Similarly, \(\range(T)\) is the subspace of \(\real^3\) having a basis given by \(\threevec21{-2}\) and \(\threevec{-1}02\text{.}\)
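These computations are easy to reproduce by machine. Here is a sketch in Python (assuming SymPy is available):
\begin{verbatim}
from sympy import Matrix

A = Matrix([[2, -1, 2, 3],
            [1, 0, 0, 2],
            [-2, 2, -4, -2]])

print(A.rref()[0])      # the reduced row echelon form shown above
print(A.nullspace())    # basis of nul(T): (0,2,1,0) and (-2,-1,0,1)
print(A.columnspace())  # basis of range(T): (2,1,-2) and (-1,0,2)
\end{verbatim}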
Example 1.2.11.
Suppose that \(V=\pbb_3\) and \(W=\real^2\) and that \(T:V\to\real^2\) where \(T(p)=\twovec{p(0)}{p'(0)}\text{.}\) A general polynomial in \(V\) has the form
\begin{equation*}
p(x)=a_3x^3+a_2x^2+a_1x+a_0
\end{equation*}
so that \(T(p)=\twovec{a_0}{a_1}\text{.}\) Therefore, \(\nul(T)\) is the set of polynomials for which \(a_0=a_1=0\) so that \(p(x)=a_3x^3+a_2x^2\text{.}\) We also see that \(\range(T) = \real^2\text{.}\)
Proposition 1.2.12.
If \(T:V\to W\text{,}\) then \(\nul(T)\) is a subspace of \(V\) and \(\range(T)\) is a subspace of \(W\text{.}\)
Proof.
Suppose that \(\vvec_1\) and \(\vvec_2\) are in \(\nul(T)\text{.}\) Then we have
\begin{align*}
T(s\vvec_1) \amp = s(T\vvec_1) = s0 = 0\\
T(\vvec_1+\vvec_2) \amp = T(\vvec_1) + T(\vvec_2) = 0 + 0 = 0.
\end{align*}
This shows that \(\nul(T)\) is closed under scalar multiplication and vector addition and is therefore a subspace of \(V\text{.}\)
Now suppose that \(\wvec_1\) and \(\wvec_2\) are in \(\range(T)\text{.}\) We know that there are vectors \(\vvec_1\) and \(\vvec_2\) in \(V\) such that \(T(\vvec_1) = \wvec_1\) and \(T(\vvec_2)=\wvec_2\text{.}\) Therefore,
\begin{align*}
s\wvec_1\amp = sT(\vvec_1) = T(s\vvec_1)\\
\wvec_1+\wvec_2 \amp = T(\vvec_1)+T(\vvec_2) =
T(\vvec_1+\vvec_2).
\end{align*}
This shows that \(\range(T)\) is closed under scalar multiplication and vector addition so \(\range(T)\) is a subspace of \(W\text{.}\)
We will frequently make use of the next proposition.
Proposition 1.2.13. Fundamental Theorem of Linear Maps.
If \(V\) is a finite dimensional vector space and \(T:V\to W\) is a linear transformation, then
\begin{equation*}
\dim \nul(T) + \dim \range(T) = \dim V\text{.}
\end{equation*}
Proof.
Suppose that \(\uvec_1,\ldots,\uvec_j\) is a basis for \(\nul(T)\text{,}\) which we extend to a basis for \(V\) by adding vectors \(\vvec_1,\ldots,\vvec_k\text{.}\) We also define \(\wvec_i=T\vvec_i\) for each \(i=1,\ldots,k\text{.}\)
Given a vector \(\vvec\) in \(V\text{,}\) we can write
\begin{equation*}
\vvec=a_1\uvec_1 + \ldots + a_j\uvec_j + b_1\vvec_1 +
\ldots + b_k\vvec_k
\end{equation*}
so that
\begin{align*}
T\vvec\amp=a_1T\uvec_1 + \ldots + a_jT\uvec_j + b_1T\vvec_1 +
\ldots + b_kT\vvec_k\\
\amp=b_1\wvec_1+\ldots+b_k\wvec_k\text{.}
\end{align*}
This shows that \(\wvec_1,\ldots,\wvec_k\) span \(\range(T)\text{.}\)
We also claim that \(\wvec_1,\ldots,\wvec_k\) form a linearly independent set. Suppose that
\begin{align*}
b_1\wvec_1 + \ldots + b_k\wvec_k \amp = 0\\
T(b_1\vvec_1+\ldots+b_k\vvec_k) \amp = 0\text{,}
\end{align*}
which means that \(b_1\vvec_1+\ldots+b_k\vvec_k\) is in \(\nul(T)\) so that this vector is a linear combination of \(\uvec_1,\ldots,\uvec_j\text{.}\) Since the vectors \(\uvec_1,\ldots,\uvec_j,\vvec_1,\ldots,\vvec_k\) form a basis for \(V\) and are therefore linearly independent, this can only happen if the vector is zero, showing that \(b_1=\ldots=b_k=0\) and the \(\wvec\) vectors are linearly independent.
We conclude that \(\wvec_1,\ldots,\wvec_k\) is a basis for \(\range(T)\) and we have
\begin{equation*}
\dim \nul(T) + \dim \range(T) = j + k = \dim V\text{.}
\end{equation*}
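The matrix of Example 1.2.10 provides a concrete check of this proposition: there \(\dim\nul(T)=2\) and \(\dim\range(T)=2\text{,}\) which add to \(4=\dim\real^4\text{.}\) A sketch in Python (assuming SymPy):
\begin{verbatim}
from sympy import Matrix

A = Matrix([[2, -1, 2, 3],
            [1, 0, 0, 2],
            [-2, 2, -4, -2]])

dim_nul = len(A.nullspace())      # 2
dim_range = len(A.columnspace())  # 2
assert dim_nul + dim_range == A.cols   # 2 + 2 = 4 = dim(R^4)
\end{verbatim}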
Definition 1.2.14.
Suppose \(T:V\to W\text{.}\) If \(\range(T)=W\text{,}\) we say that \(T\) is surjective. If \(\nul(T)=\{0\}\text{,}\) we say that \(T\) is injective.
If \(T\) is surjective, notice that for every vector \(\wvec\in W\) there is a vector \(\vvec\in V\) for which \(T\vvec=\wvec\text{.}\)
If \(T\) is injective and \(T\vvec_1=T\vvec_2\text{,}\) then \(\vvec_1=\vvec_2\) since \(T(\vvec_1-\vvec_2)=0\) meaning \(\vvec_1-\vvec_2\in \nul(T)\text{.}\)
Example 1.2.15.
Once again, these are familiar notions. Suppose that \(A\) is an \(m\times n\) matrix that defines a linear transformation \(T:\real^n\to\real^m\text{.}\) Then \(T\) is injective precisely when \(\nul(A) = \{\zerovec\}\text{,}\) which happens when the columns of \(A\) are linearly independent.
The transformation \(T\) is surjective if \(\col(A) = \real^m\text{,}\) which happens when the columns of \(A\) span \(\real^m\text{.}\)
Example 1.2.16.
The linear transformation \(T:\pbb\to\pbb\) defined by \(T(p)(x)=xp(x)\) is injective but not surjective because the nonzero constant polynomials are not in \(\range(T)\text{.}\)
Definition 1.2.17.
A linear transformation \(T:V\to W\) is called an isomorphism if \(T\) is both surjective and injective.
Proposition 1.2.18.
If \(T:V\to W\) is surjective, then \(\dim V \geq \dim
W\text{.}\)
If \(T:V\to W\) is injective, then \(\dim V
\leq \dim W\text{.}\)
If \(T:V\to W\) is an isomorphism, then \(\dim V
= \dim W\text{.}\)
Subsection 1.2.3 Vector space isomorphisms
Example 1.2.19.
Consider the linear transformation \(T:\pbb_2\to\real^3\) defined by \(T(p)=\threevec{p(0)}{p'(0)}{p''(0)}\text{.}\) If \(p(x)=a_0+a_1x+a_2x^2\text{,}\) then \(T(p) = \threevec{a_0}{a_1}{2a_2}\text{.}\) This shows that \(\nul(T) = \{0\}\) and \(\range(T)=\real^3\text{.}\) Therefore, \(T\) is a vector space isomorphism.
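With respect to the basis \(\{1,x,x^2\}\) of \(\pbb_2\) and the standard basis of \(\real^3\text{,}\) this \(T\) is represented by a \(3\times3\) matrix whose invertibility confirms the isomorphism. A sketch in Python (assuming SymPy):
\begin{verbatim}
from sympy import Matrix

# columns are T(1), T(x), T(x^2) in the standard coordinates of R^3
M = Matrix([[1, 0, 0],
            [0, 1, 0],
            [0, 0, 2]])

assert M.det() != 0   # so nul(T) = {0} and range(T) = R^3
print(M.inv())        # represents the inverse transformation
\end{verbatim}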
Example 1.2.20.
If \(V\) is a vector space, then \(I:V\to V\) defined by \(I(\vvec) = \vvec\) is a linear transformation called the identity transformation.
Suppose that \(T:V\to W\) is an isomorphism. Then every vector \(\wvec\) in \(W\) has a vector \(\vvec\) in \(V\) such that \(T(\vvec) = \wvec\text{.}\) In fact, there is exactly one such vector since if \(T(\vvec_1)=T(\vvec_2) =
\wvec\text{,}\) we know that \(\vvec_1=\vvec_2\) because \(T\) is injective. In this case, we can define a function \(S:W\to V\) where \(S(\wvec)\) is the vector \(\vvec\) for which \(T(\vvec) = \wvec\text{.}\)
Notice that \(S:W\to V\) is a linear transformation. For instance, if \(T(\vvec) = \wvec\text{,}\) then \(T(s\vvec) =
s\wvec\text{,}\) which says that
\begin{equation*}
S(s\wvec) = s\vvec = sS(\wvec)\text{.}
\end{equation*}
In the same way, we have
\begin{equation*}
S(\wvec_1+\wvec_2) = \vvec_1+\vvec_2 = S(\wvec_1)+S(\wvec_2)\text{.}
\end{equation*}
Therefore, we have the following proposition.
Proposition 1.2.21.
If \(T:V\to W\) is an isomorphism, there is a linear transformation \(S:W\to V\) such that \(ST=I_V\text{,}\) the identity transformation on \(V\text{,}\) and \(TS=I_W\text{,}\) the identity transformation on \(W\text{.}\) We will typically denote \(S\) as \(T^{-1}\text{.}\)
Proposition 1.2.22.
If \(V\) is a finite dimensional vector space of dimension \(n\) over the field \(\field\text{,}\) then there is an isomorphism \(T:\field^n\to V\text{.}\)
Proof.
We choose a basis \(\basis{\vvec}{n}\) and define
\begin{equation*}
T\left(\threevec{c_1}{\vdots}{c_n}\right) = c_1\vvec_1 +
c_2\vvec_2 + \ldots + c_n\vvec_n\text{.}
\end{equation*}
By the linear independence of the basis, we see that \(T\) is injective. Since the span of the basis vectors is \(V\text{,}\) we see that \(T\) is surjective.
The term
isomorphism means “having the same shape or structure.” In other words, isomorphic vector spaces have the same structure. In our earlier courses, we considered only the vector spaces
\(\real^n\text{.}\) The previous proposition,
Proposition 1.2.22, shows us that every finite dimensional real vector space has the same structure as
\(\real^n\text{.}\) This means that, technically speaking, we were also studying finite dimensional real vector spaces at the same time.
Notice, however, that the isomorphism in
Proposition 1.2.22 depends on a choice of basis. If two people choose different bases, then they will produce different isomorphisms. In fact, as we move forward, some of our work will be motivated by choosing a basis that creates a particularly nice isomorphism. Our next discussion of matrices will lay that foundation.
Subsection 1.2.4 Representing linear transformations with matrices
Proposition 1.2.22 says that every finite dimensional vector space is essentially the same as
\(\field^n\text{.}\) Therefore, we are able to represent elements in a vector space as more typical vectors in
\(\field^n\) and linear transformations as matrices. Let us now make this precise.
Suppose that we have a basis \(\bcal=\{\basis{\vvec}{n}\}\) for a finite dimensional vector space \(V\text{.}\) If \(\vvec\) is an element of \(V\text{,}\) then we can uniquely write
\begin{equation*}
\vvec = c_1\vvec_1 + c_2\vvec_2 + \ldots + c_n\vvec_n\text{.}
\end{equation*}
As shorthand, we will write
\begin{equation*}
\coords{\vvec}{\bcal} = \fourvec{c_1}{c_2}{\vdots}{c_n}\text{.}
\end{equation*}
This should be familiar from our
earlier work when we used a basis of
\(\real^m\) to form a new coordinate system.
Example 1.2.23.
Consider the vector space \(V=\pbb_2\) with the basis \(\bcal=\{1,x,x^2\}\text{.}\) Then we have
\begin{equation*}
\coords{a_0+a_1x+a_2x^2}{\bcal} = \threevec{a_0}{a_1}{a_2}\text{.}
\end{equation*}
We may think of this as a coordinate system in the vector space of polynomials.
In a similar way, we can represent linear transformations using matrices. Suppose that \(T:V\to W\) is a linear transformation and that we have a basis \(\bcal=\{\vvec_1,\ldots,\vvec_n\}\) for \(V\) and a basis \(\ccal=\{\wvec_1,\ldots,\wvec_m\}\) for \(W\text{.}\) We then have
\begin{equation*}
T(\vvec_j) = A_{1,j}\wvec_1 + A_{2,j}\wvec_2 + \ldots +
A_{m,j}\wvec_m \text{,}
\end{equation*}
which defines an \(m\times n\) matrix \(A\text{.}\) In the same way that we denoted the coordinates of a vector in terms of a basis, we denote the matrix of the linear transformation by \(\coords{T}{\ccal,\bcal} = A\text{.}\) Notice that
\begin{equation*}
\coords{T(\vvec_j)}{\ccal} =
\fourvec{A_{1,j}}{A_{2,j}}{\vdots}{A_{m,j}}
\end{equation*}
meaning that the columns of \(\coords{T}{\ccal,\bcal}\) are the coordinates \(\coords{T(\vvec_j)}{\ccal}\text{:}\)
\begin{equation*}
\coords{T}{\ccal,\bcal} =
\left[
\begin{array}{cccc}
\coords{T(\vvec_1)}{\ccal} \amp
\coords{T(\vvec_2)}{\ccal} \amp
\ldots \amp
\coords{T(\vvec_n)}{\ccal}
\end{array}
\right]\text{.}
\end{equation*}
At first glance, this notation may seem a little intimidating, but it will become clear with a little practice.
Definition 1.2.24.
If \(T:V\to W\) is a linear transformation, \(\bcal\) a basis for \(V\) and \(\ccal\) a basis for \(W\text{,}\) we say that the matrix \(\coords{T}{\ccal,\bcal}\) is the matrix associated to \(T\) with respect to these bases.
Example 1.2.25.
Consider \(T:\pbb_3\to\pbb_2\) where \(T(p)=p'\text{.}\) If we choose the bases \(\bcal=\{x^3,x^2,x,1\}\) and \(\ccal=\{x^2,x,1\}\text{,}\) then
\begin{equation*}
\coords{T}{\ccal,\bcal}=\begin{bmatrix}
3 \amp 0 \amp 0 \amp 0 \\
0 \amp 2 \amp 0 \amp 0 \\
0 \amp 0 \amp 1 \amp 0 \\
\end{bmatrix}\text{.}
\end{equation*}
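This matrix can be built mechanically, column by column, by differentiating each vector of \(\bcal\) and recording its \(\ccal\)-coordinates. A sketch in Python (assuming SymPy; the helper coords is hypothetical and written only for monomial bases):
\begin{verbatim}
from sympy import Matrix, Poly, diff, symbols

x = symbols('x')
B = [x**3, x**2, x, 1]   # basis of P_3
C = [x**2, x, 1]         # basis of P_2

def coords(p, basis):
    # coordinates of p with respect to a monomial basis,
    # listed from highest degree down to the constant term
    cs = Poly(p, x).all_coeffs()
    return Matrix([0] * (len(basis) - len(cs)) + cs)

columns = [coords(diff(p, x), C) for p in B]
print(Matrix.hstack(*columns))   # the 3x4 matrix above
\end{verbatim}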
The next proposition says that the composition of linear transformations corresponds to matrix multiplication.
Proposition 1.2.26.
If \(T:V\to W\) and \(S:W\to U\) are linear transformations and \(\bcal\text{,}\) \(\ccal\text{,}\) and \(\dcal\) are bases for \(V\text{,}\) \(W\text{,}\) and \(U\text{,}\) respectively, then
\begin{equation*}
\coords{ST}{\dcal, \bcal} = \coords{S}{\dcal,\ccal}
\coords{T}{\ccal,\bcal} \text{.}
\end{equation*}
Proof.
We denote the vectors in the bases by \(\basis{\vvec}{m}\text{,}\) \(\basis{\wvec}{n}\text{,}\) and \(\basis{\uvec}{p}\text{,}\) respectively. Similarly, we use the shorthand
\begin{align*}
A \amp = \coords{T}{\ccal,\bcal}\\
B \amp = \coords{S}{\dcal,\ccal}\\
C \amp = \coords{ST}{\dcal,\bcal}
\end{align*}
We have
\begin{align*}
T\vvec_j \amp = \sum_k A_{k,j} \wvec_k\\
S\wvec_k \amp = \sum_l B_{l,k} \uvec_l\text{,}
\end{align*}
which implies that
\begin{align*}
(ST)\vvec_j \amp = S\left(\sum_kA_{k,j}\wvec_k\right)\\
\amp = \sum_k A_{k,j} S(\wvec_k)\\
\amp = \sum_k A_{k,j} \sum_l B_{l,k}\uvec_l\\
\amp = \sum_l \sum_k B_{l,k}A_{k,j} \uvec_l\\
\amp = \sum_l C_{l,j} \uvec_l.
\end{align*}
Therefore, \(C_{l,j} = \sum_kB_{l,k}A_{k,j}\text{,}\) which says that \(C=BA\) as expected.
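To see this proposition in action, let \(T:\pbb_3\to\pbb_2\) and \(S:\pbb_2\to\pbb_1\) both be differentiation, with monomial bases throughout, so that \(ST\) is the second derivative. A sketch in Python (assuming SymPy; coords and deriv_matrix are hypothetical helpers for monomial bases):
\begin{verbatim}
from sympy import Matrix, Poly, diff, symbols

x = symbols('x')

def coords(p, n):
    # coordinates of p in the monomial basis x^(n-1), ..., x, 1
    cs = Poly(p, x).all_coeffs()
    return Matrix([0] * (n - len(cs)) + cs)

def deriv_matrix(n):
    # matrix of differentiation P_{n-1} -> P_{n-2} in monomial bases
    basis = [x**k for k in range(n - 1, -1, -1)]
    return Matrix.hstack(*[coords(diff(p, x), n - 1) for p in basis])

A = deriv_matrix(4)   # matrix of T : P_3 -> P_2
B = deriv_matrix(3)   # matrix of S : P_2 -> P_1

# matrix of ST, the second derivative, computed directly
M = Matrix.hstack(*[coords(diff(p, x, 2), 2)
                    for p in [x**3, x**2, x, 1]])
assert B * A == M
\end{verbatim}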
A similar result holds for the coordinate representations of vectors.
Proposition 1.2.27.
Suppose that \(T:V\to W\) is a linear transformation and \(\bcal\) is a basis for \(V\) and \(\ccal\) is a basis for \(W\text{.}\) If \(\vvec\) is a vector in \(V\text{,}\) then
\begin{equation*}
\coords{T(\vvec)}{\ccal} = \coords{T}{\ccal,\bcal}
\coords{\vvec}{\bcal}\text{.}
\end{equation*}
An important example is when the linear transformation is the identity \(I:V\to V\) and we have two bases \(\bcal=\{\basis{\vvec}{n}\}\) and \(\ccal\) for \(V\text{.}\) In this case,
\begin{equation*}
\coords{I}{\ccal,\bcal} =
\left[
\begin{array}{ccc}
\coords{\vvec_1}{\ccal} \amp \ldots \amp
\coords{\vvec_n}{\ccal}
\end{array}
\right]\text{.}
\end{equation*}
This matrix then represents the change of coordinates
\begin{equation*}
\coords{\vvec}{\ccal} = \coords{I}{\ccal,\bcal}
\coords{\vvec}{\bcal}\text{.}
\end{equation*}
Example 1.2.28.
Suppose that \(V = \pbb_2\) and that \(\bcal=\{1+x,x-x^2,x^2-1\}\) and \(\ccal=\{1,x,x^2\}\text{.}\) Then
\begin{align*}
\coords{I}{\ccal,\bcal}\amp =
\begin{bmatrix}
\coords{1+x}{\ccal} \amp
\coords{x-x^2}{\ccal} \amp
\coords{x^2-1}{\ccal}
\end{bmatrix}\\
\amp =
\begin{bmatrix}
1 \amp 0 \amp -1 \\
1 \amp 1 \amp 0 \\
0 \amp -1 \amp 1
\end{bmatrix}.
\end{align*}
This matrix converts the coordinate representation of a polynomial in the \(\bcal\) basis into the coordinate representation of the same polynomial in the \(\ccal\) basis.
The inverse of this matrix will convert the \(\ccal\)-coordinate representation of a polynomial into the \(\bcal\)-coordinate representation:
\begin{equation*}
\coords{I}{\bcal,\ccal} = \coords{I}{\ccal,\bcal}^{-1} =
\begin{bmatrix}
1/2 \amp 1/2 \amp 1/2 \\
-1/2 \amp 1/2 \amp -1/2 \\
-1/2 \amp 1/2 \amp 1/2 \\
\end{bmatrix}\text{.}
\end{equation*}
Consider the polynomial \(p(x)=4-2x+8x^2\text{.}\) We then have
\begin{equation*}
\coords{p}{\ccal} = \threevec4{-2}8
\end{equation*}
and
\begin{equation*}
\coords{p}{\bcal} =
\begin{bmatrix}
1/2 \amp 1/2 \amp 1/2 \\
-1/2 \amp 1/2 \amp -1/2 \\
-1/2 \amp 1/2 \amp 1/2 \\
\end{bmatrix}
\threevec4{-2}8
= \threevec5{-7}1\text{.}
\end{equation*}
This means that
\begin{equation*}
p(x) = 4-2x+8x^2 = 5(1+x)-7(x-x^2)+1(x^2-1)
\end{equation*}
as is easily checked.
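The entire computation is easy to verify by machine. A sketch in Python (assuming SymPy):
\begin{verbatim}
from sympy import Matrix

# columns are the C-coordinates of 1+x, x-x^2, and x^2-1
P = Matrix([[1, 0, -1],
            [1, 1, 0],
            [0, -1, 1]])

p_C = Matrix([4, -2, 8])   # coordinates of p(x) = 4-2x+8x^2 in C
p_B = P.inv() * p_C        # coordinates of p in B

print(P.inv())             # the matrix [I]_{B,C} above
print(p_B)                 # (5, -7, 1)
assert P * p_B == p_C      # changing back recovers the C-coordinates
\end{verbatim}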
We will often be interested in linear transformations \(T:V\to
V\) in which the codomain and the domain are the same vector space. Given a basis \(\bcal\) for \(V\text{,}\) we can then represent \(T\) in terms of this basis as \(\coords{T}{\bcal,\bcal}\) where the same basis is used for the codomain and domain. The following proposition shows how the matrices representing the same transformation with respect to two different bases are related.
Proposition 1.2.29.
Suppose that \(T:V\to V\) is a linear transformation and that \(\bcal\) and \(\ccal\) are two bases for \(V\text{.}\) Then
\begin{equation*}
\coords{T}{\ccal,\ccal} =
\coords{I}{\ccal,\bcal}
\coords{T}{\bcal,\bcal}
\coords{I}{\bcal,\ccal}
=
\coords{I}{\ccal,\bcal}
\coords{T}{\bcal,\bcal}
\coords{I}{\ccal,\bcal}^{-1}\text{.}
\end{equation*}
Here is a simpler way to represent this statement. If \(B =
\coords{T}{\ccal,\ccal}\text{,}\) \(A=\coords{T}{\bcal,\bcal}\text{,}\) and \(P=\coords{I}{\ccal,\bcal}\text{,}\) then we have
\begin{equation*}
B = PAP^{-1}\text{.}
\end{equation*}
This should remind you of the kind of expression we saw when we were diagonalizing matrices and gives some idea of where we are heading.
Definition 1.2.30.
Two \(n\times n\) matrices \(A\) and \(B\) are called similar if there is an invertible matrix \(P\) such that
\begin{equation*}
B = PAP^{-1}\text{.}
\end{equation*}
Notice that a matrix is diagonalizable precisely when it is similar to a diagonal matrix.
Proposition 1.2.31.
Similarity is an equivalence relation on the set of \(n\times n\) matrices.
Proposition 1.2.32.
Suppose that \(A\) and \(B\) are similar \(n\times n\) matrices and that \(A=\coords{T}{\bcal,\bcal}\) for some linear transformation \(T:V\to V\) and basis \(\bcal\) for \(V\text{.}\) Then \(B=\coords{T}{\ccal,\ccal}\) for some other basis \(\ccal\text{.}\)
In other words, two similar matrices represent the same linear transformation in two different bases. This is why we should expect similar matrices to share important properties, such as their determinants, eigenvalues, and more.
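Here is a quick check of two such shared properties, sketched in Python (assuming SymPy; the matrices \(A\) and \(P\) are arbitrary choices with \(P\) invertible):
\begin{verbatim}
from sympy import Matrix

A = Matrix([[2, 1, 0],
            [0, 3, 1],
            [0, 0, 2]])
P = Matrix([[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]])   # det(P) = 2, so P is invertible

B = P * A * P.inv()
assert B.det() == A.det()
assert B.eigenvals() == A.eigenvals()  # same eigenvalues, multiplicities
\end{verbatim}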
We close this section by noting that the set \(L(V,W)\) of linear transformations \(T:V\to W\text{,}\) introduced in Definition 1.2.7, is itself a vector space.
Proposition 1.2.34.
If \(V\) and \(W\) are two vector spaces, then \(L(V,W)\) is a vector space. Moreover, if \(\bcal\) is a finite basis for \(V\) and \(\ccal\) is a finite basis for \(W\text{,}\) then the function \(S:L(V,W)\to
\field^{m,n}\text{,}\) where \(n=\dim V\) and \(m=\dim W\text{,}\) defined by
\begin{equation*}
S(T) = \coords{T}{\ccal,\bcal}
\end{equation*}
is an isomorphism. It then follows that
\begin{equation*}
\dim L(V,W) = \dim V \dim W\text{.}
\end{equation*}
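For example, if \(V=\pbb_3\) and \(W=\real^2\text{,}\) then linear transformations correspond to \(2\times4\) matrices, so \(\dim L(\pbb_3,\real^2) = 2\cdot4 = 8\text{.}\) A sketch in Python (assuming SymPy) that lists the standard basis of such matrices:
\begin{verbatim}
from sympy import zeros

m, n = 2, 4   # m = dim W, n = dim V
basis = []
for i in range(m):
    for j in range(n):
        E = zeros(m, n)
        E[i, j] = 1       # the matrix whose only nonzero entry is at (i, j)
        basis.append(E)

print(len(basis))   # 8 = dim L(V, W)
\end{verbatim}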