
Section 1.1 Vector spaces

A vector space is simply a mathematical set on which we can perform addition and scalar multiplication. We already have some familiarity with vector spaces since \(\real^n\) is a good example. However, as mentioned in the introduction to this chapter, polynomials have similar operations, so we would like to create a mathematical structure that allows us to study vectors and polynomials as equals. This is why the concept of a vector space is so useful.

Subsection 1.1.1 Vector spaces

The usual place to get started would be with a general definition of a vector space. However, this is one place in mathematics, among others, where a general definition can obscure the underlying idea. For that reason, let’s just start with some examples.

Example 1.1.1. Matrices.

Let’s look at the set of all \(3\times 2\) matrices, which includes the matrices
\begin{equation*} A=\begin{bmatrix} 3 \amp -1 \\ 0 \amp 2 \\ 4 \amp -3 \\ \end{bmatrix}, \hspace{0.5in} B=\begin{bmatrix} 1 \amp 3 \\ -1 \amp 0 \\ 2 \amp 4 \\ \end{bmatrix}\text{.} \end{equation*}
As we saw in our earlier course, we can multiply a matrix by a scalar and we can add matrices:
\begin{equation*} -3A=\begin{bmatrix} -9 \amp 3 \\ 0 \amp -6 \\ -12 \amp 9 \\ \end{bmatrix}, \hspace{24pt} A+B=\begin{bmatrix} 4 \amp 2 \\ -1 \amp 2 \\ 6 \amp 1 \\ \end{bmatrix}\text{.} \end{equation*}
Notice that both operations produce a new object that is also a \(3\times2\) matrix. We say that the set is closed under these operations.
With these operations, the set of \(3\times2\) matrices becomes a vector space.
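Although the text does not depend on software, these computations are easy to spot-check. Here is a minimal sketch in Python with NumPy, included only as an illustration:

```python
import numpy as np

# The 3x2 matrices A and B from this example.
A = np.array([[3, -1], [0, 2], [4, -3]])
B = np.array([[1, 3], [-1, 0], [2, 4]])

# Scalar multiplication and addition each yield another 3x2 matrix,
# illustrating closure under both operations.
print(-3 * A)          # [[-9  3] [ 0 -6] [-12  9]]
print(A + B)           # [[ 4  2] [-1  2] [ 6  1]]
print((A + B).shape)   # (3, 2)
```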
Notice that the entries in our matrices are real numbers; that is, they belong to \(\real\text{.}\) We could instead change the example so that we consider \(3\times 2\) matrices whose entries are in the complex numbers \(\complex\text{.}\)

Example 1.1.2. Complex matrices.

Consider now the set of \(1\times2\) matrices with complex entries. For example,
\begin{equation*} A=\begin{bmatrix} 2-3i \amp 4 \\ \end{bmatrix},\hspace{24pt} B=\begin{bmatrix} i \amp 1+i \\ \end{bmatrix}\text{.} \end{equation*}
Scalar multiplication includes multiplication by complex numbers so we have
\begin{equation*} (3+i)A = \begin{bmatrix} 9-7i \amp 12+4i \\ \end{bmatrix},\hspace{24pt} A+B = \begin{bmatrix} 2-2i \amp 5+i \end{bmatrix}\text{.} \end{equation*}
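As a quick check (a Python sketch using the built-in complex type, where `1j` denotes \(i\)), one can verify these computations entrywise:

```python
# Entries of the 1x2 matrices A and B; 1j denotes the imaginary unit i.
A = [2 - 3j, 4]
B = [1j, 1 + 1j]

# Scalar multiplication by the complex scalar 3 + i, applied entrywise.
print([(3 + 1j) * a for a in A])      # [(9-7j), (12+4j)]

# Entrywise addition of the two matrices.
print([a + b for a, b in zip(A, B)])  # [(2-2j), (5+1j)]
```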
These examples show that vector spaces have an underlying field, which is the set of scalars by which we can multiply. You may or may not know about fields depending on whether you have studied abstract algebra. In either case, the underlying field of our vector spaces will always be either the real numbers or the complex numbers, which we will write as \(\field=\real\) or \(\field=\complex\text{.}\)
Having seen some examples, we offer a general definition of a vector space.

Definition 1.1.3. Vector space.

A vector space over a field \(\field\) is a set \(V\) with two operations, scalar multiplication by elements of \(\field\) and addition, under which \(V\) is closed. Moreover, these operations satisfy the following natural properties:
  • Addition is commutative; that is, \(\vvec+\wvec=\wvec+\vvec\) for every pair \(\vvec,\wvec\in V\text{.}\)
  • There is an additive identity; that is, there is an element \(0\in V\) such that \(0+\vvec = \vvec\) for every \(\vvec\in V\text{.}\)
  • Every element \(\vvec\) has an additive inverse \(\wvec\) such that \(\vvec + \wvec = 0\text{.}\) We will usually write the additive inverse as \(-\vvec\text{.}\)
  • Addition is associative, which means that we can regroup a sum in the following way:
    \begin{equation*} (\uvec+\vvec) + \wvec = \uvec + (\vvec+\wvec)\text{.} \end{equation*}
  • For every \(\vvec\in V\text{,}\) we have \(1\vvec = \vvec\text{.}\)
  • Scalar multiplication is associative and distributive in the sense that
    \begin{equation*} s(t\vvec) = (st)\vvec, \hspace{24pt} (s+t)\vvec = s\vvec + t\vvec, \hspace{24pt} s(\vvec+\wvec) = s\vvec + s\wvec\text{.} \end{equation*}
That is a long list of properties. Technically speaking, if we want to check that some set is a vector space, we need to check each one of those properties. In practice, however, we will know a vector space when we see one, and we will be fairly loose with these details.

Example 1.1.4. Polynomials.

If \(\field=\real\) or \(\field=\complex\text{,}\) the set of polynomials whose coefficients are in \(\field\) forms a vector space \(\pbb\text{.}\)

Example 1.1.5. Polynomials of degree \(n\).

Rather than the set of all polynomials, we define the set \(\pbb_n\) to be the set of all polynomials whose degree is \(n\) or less. For example, \(\pbb_2\) contains all polynomials of degree two or less:
\begin{equation*} p(x) = a_2x^2 + a_1x + a_0 \end{equation*}
where the coefficients \(a_j\) are assumed to be either real or complex, as will be either specified or clear from the context.
Of course, the set of all polynomials is larger than the set of polynomials of degree two or less, and we have \(\pbb_2\subset \pbb\text{.}\) We say that \(\pbb_2\) is a vector subspace of \(\pbb\text{.}\)

Definition 1.1.6. Vector subspace.

A subset \(W\) of a vector space \(V\) is called a subspace of \(V\) if \(W\) is closed under the operations of scalar multiplication and addition that it inherits from \(V\text{.}\)
Notice that a subspace is itself a vector space and that the underlying fields of \(V\) and \(W\) are the same.
Every vector space \(V\) has two subspaces that we will frequently need to consider. Namely, the subspace consisting of only the zero vector \(W=\{0\}\) and the entire vector space \(W=V\) itself.

Example 1.1.7. Function spaces.

Let \(\fcal\) be the set of functions whose domain is \(\real\) and whose codomain is \(\complex\text{;}\) that is, functions of the form \(f:\real\to\complex\text{.}\) With addition and scalar multiplication defined pointwise, \(\fcal\) is a complex vector space.
If we were to consider functions \(f:\real\to\real\text{,}\) we would obtain a real vector space. This is not a subspace of \(\fcal\text{,}\) however, since the underlying fields are different. Rather, here are some natural subspaces of \(\fcal\text{.}\)

Example 1.1.8.

The following are subspaces of \(\fcal\text{:}\)
  • The set of functions \(f:\real\to\complex\) for which \(f(17)=0\text{.}\)
  • The set of periodic functions whose period is 7; that is, functions that satisfy \(f(x+7)=f(x)\) for all \(x\text{.}\)
  • The set of continuous functions.
The set of functions that satisfy \(f(17)=1\) is, however, not a subspace since it is not closed under scalar multiplication or vector addition.
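To see concretely why the first of these subspaces is closed while the set with \(f(17)=1\) is not, here is a small Python sketch; the particular functions are hypothetical choices made only for illustration:

```python
# Two functions from R to C that vanish at 17.
f = lambda x: (x - 17) * (1 + 2j)
g = lambda x: (x - 17) ** 2

add = lambda x: f(x) + g(x)        # the sum of two such functions
scale = lambda x: (2 - 1j) * f(x)  # a complex scalar multiple

print(add(17), scale(17))  # both print 0j: the condition f(17) = 0 is preserved

# In contrast, if m(17) = 1, then (2m)(17) = 2, so scaling leaves the set.
m = lambda x: x / 17
print(2 * m(17))  # 2.0, not 1
```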

Example 1.1.9.

If \(V\) is a vector space and \(V_1\) and \(V_2\) are subspaces, then \(V_1\cap V_2\) is also a subspace of \(V\) since the intersection is closed under scalar multiplication and addition.
When working with a vector space \(V\text{,}\) we will frequently refer to the elements of \(V\) as vectors even though they may be polynomials, matrices, functions, or even something entirely different.

Subsection 1.1.2 Linear combinations

Our study of linear algebra really began once we introduced linear combinations. Of course, linear combinations are defined purely in terms of scalar multiplication and addition so we can form linear combinations of elements in a vector space.

Definition 1.1.10.

Suppose that \(\vvec_1,\ldots,\vvec_m\) is a set of vectors in a vector space \(V\) over a field \(\field\text{.}\) A linear combination of these vectors is a vector of the form
\begin{equation*} a_1\vvec_1 + a_2\vvec_2 + \ldots + a_m\vvec_m \end{equation*}
where the scalars \(a_j\) belong to the field \(\field\text{.}\)

Example 1.1.11.

Consider the vector space \(\pbb_2\) consisting of polynomials having degree two or less and the polynomials \(p_1(x)=3x+4\) and \(p_2(x)=7x^2-2x+1\text{.}\) We can form the linear combination
\begin{equation*} 2p_1(x)-3p_2(x) = -21x^2 +12x+5\text{.} \end{equation*}
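Such computations are easy to verify with a computer algebra system; here is an illustrative check using SymPy:

```python
import sympy as sp

x = sp.symbols('x')
p1 = 3*x + 4
p2 = 7*x**2 - 2*x + 1

# Expand the linear combination 2*p1 - 3*p2.
print(sp.expand(2*p1 - 3*p2))  # -21*x**2 + 12*x + 5
```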
We can also think about concepts like span and linear independence.

Definition 1.1.12. Span.

The span of a set of vectors in a vector space is the set of all linear combinations that can be formed from the set.
It’s not hard to see that the span of a set of vectors \(\vvec_1,\vvec_2,\ldots,\vvec_m\) in \(V\) forms a subspace. We just have to check that the span is closed under scalar multiplication and addition. So we will consider vectors
\begin{align*} \uvec \amp = a_1\vvec_1 + a_2 \vvec_2 + \ldots + a_m\vvec_m\\ \wvec \amp = b_1\vvec_1 + b_2 \vvec_2 + \ldots + b_m\vvec_m\text{.} \end{align*}
If we multiply \(\uvec\) by the scalar \(s\text{,}\) we have
\begin{equation*} s\uvec = (sa_1)\vvec_1 + (sa_2) \vvec_2 + \ldots + (sa_m)\vvec_m\text{,} \end{equation*}
which is in the span of the set of vectors. Similarly,
\begin{equation*} \uvec+\wvec = (a_1+b_1)\vvec_1 + (a_2+b_2) \vvec_2 + \ldots + (a_m+b_m)\vvec_m\text{,} \end{equation*}
which is also in the span. This demonstrates the following proposition.

Proposition 1.1.13.

The span of a set of vectors \(\vvec_1,\vvec_2,\ldots,\vvec_m\) in a vector space \(V\) is a subspace of \(V\text{.}\)

We can also define linear dependence as before.

Definition 1.1.14. Linear independence.

A set of vectors in \(V\) is linearly dependent if one of the vectors can be written as a linear combination of the others. Otherwise, we say that the set is linearly independent.

Example 1.1.15.

In \(\pbb_2\text{,}\) consider the polynomials
\begin{equation*} p_1(x)=x^2-x+2,\hspace{24pt} p_2(x)=3x^2+4x-1,\hspace{24pt} p_3(x)=-7x+7\text{.} \end{equation*}
This set of polynomials is linearly dependent because \(p_3=3p_1-p_2\text{.}\)
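One can check this relation directly, for instance with a short SymPy computation (included only as an illustration):

```python
import sympy as sp

x = sp.symbols('x')
p1 = x**2 - x + 2
p2 = 3*x**2 + 4*x - 1
p3 = -7*x + 7

# The relation p3 = 3*p1 - p2 certifies linear dependence.
print(sp.expand(3*p1 - p2 - p3))  # 0
```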
Notice that this also says that
\begin{equation*} 3p_1 - p_2 - p_3 = 0\text{,} \end{equation*}
which leads to the next proposition.

Proposition 1.1.16.

A set of vectors \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is linearly dependent if and only if there are scalars \(a_1,a_2,\ldots,a_m\text{,}\) not all of which are zero, such that
\begin{equation*} a_1\vvec_1+a_2\vvec_2+\ldots+a_m\vvec_m = 0\text{.} \end{equation*}
Equivalently, the set is linearly independent if and only if the only such scalars are \(a_1=a_2=\ldots=a_m=0\text{.}\)

Proof.

The second statement is logically equivalent to the first so our proof will focus on the first statement. Suppose that the set \(\vvec_1,\ldots,\vvec_m\) is linearly dependent and that \(\vvec_k\) is the first vector that is a linear combination of vectors that occur previously in the list. This means that there are scalars \(c_1,c_2,\ldots,c_{k-1}\) such that
\begin{equation*} \vvec_k = c_1\vvec_1+c_2\vvec_2+\ldots+c_{k-1}\vvec_{k-1}\text{.} \end{equation*}
We can rewrite this expression as
\begin{equation*} c_1\vvec_1+c_2\vvec_2+\ldots+c_{k-1}\vvec_{k-1}-\vvec_k = 0\text{,} \end{equation*}
which means that there are scalars \(a_j\text{,}\) not all of which are zero since the coefficient of \(\vvec_k\) is \(-1\text{,}\) with
\begin{equation*} a_1\vvec_1+a_2\vvec_2+\ldots+a_m\vvec_m = 0\text{.} \end{equation*}
Conversely, suppose that
\begin{equation*} a_1\vvec_1+a_2\vvec_2+\ldots+a_m\vvec_m = 0 \end{equation*}
for some set of scalars, not all of which are zero, and let \(a_k\) be the last nonzero scalar. We can rewrite this expression as
\begin{equation*} \vvec_k = -\frac{a_1}{a_k}\vvec_1 -\frac{a_2}{a_k}\vvec_2 -\ldots -\frac{a_{k-1}}{a_k}\vvec_{k-1}\text{.} \end{equation*}
This shows that \(\vvec_k\) is a linear combination of the other vectors and that the set of vectors is therefore linearly dependent.

Proposition 1.1.17.

Suppose that one of the vectors in the set \(\vvec_1,\vvec_2,\ldots,\vvec_m\text{,}\) say \(\vvec_j\text{,}\) is a linear combination of the others. Then removing \(\vvec_j\) from the set does not change the span.

Proof.

If \(\wvec = a_1\vvec_1+a_2\vvec_2+\ldots+a_m\vvec_m\text{,}\) then we can replace \(\vvec_j\) in this expression with a linear combination of the other vectors. This shows that \(\wvec\) can be written as a linear combination of the set of vectors with \(\vvec_j\) removed.

Subsection 1.1.3 Bases

Definition 1.1.18.

A set of vectors in a vector space \(V\) forms a basis for \(V\) if the set is linearly independent and its span is \(V\text{.}\)

Example 1.1.19.

We can see that the polynomials
\begin{equation*} p_1(x)=1,\hspace{12pt} p_2(x)=x,\hspace{12pt} p_3(x)=x^2 \end{equation*}
form a basis of \(\pbb_2\text{.}\) Notice that this statement is true for both \(\field=\real\) and \(\field=\complex\text{.}\)
First, every polynomial \(p\) in \(\pbb_2\) can be written as
\begin{equation*} p(x)=a_0 + a_1x + a_2x^2\text{,} \end{equation*}
showing that \(p_1\text{,}\) \(p_2\text{,}\) and \(p_3\) span \(\pbb_2\text{.}\) To see that these polynomials are linearly independent, suppose that
\begin{equation*} a_1 p_1(x) + a_2p_2(x) + a_3p_3(x) = 0\text{,} \end{equation*}
the additive identity in \(\pbb_2\text{.}\) We therefore have
\begin{equation*} a_1 + a_2x + a_3x^2 = 0 + 0x+0x^2 \end{equation*}
from which we conclude that \(a_1=0\text{,}\) \(a_2=0\text{,}\) and \(a_3=0\text{.}\) Therefore, \(p_1\text{,}\) \(p_2\text{,}\) and \(p_3\) are linearly independent by Proposition 1.1.16 and hence form a basis for \(\pbb_2\text{.}\)

Example 1.1.20.

The polynomials
\begin{equation*} q_1(x)=x^2+3x,\hspace{24pt} q_2(x)=-x^2+x,\hspace{24pt} q_3(x)=2x^2+4x+2 \end{equation*}
form a basis of \(\pbb_2\text{.}\)
To see this, suppose that \(p(x)=a_0 + a_1x + a_2x^2\) is a polynomial in \(\pbb_2\text{.}\) We wish to see that \(p\) can be written as a linear combination of \(q_1\text{,}\) \(q_2\text{,}\) and \(q_3\text{.}\) This means that there are scalars \(c_1\text{,}\) \(c_2\text{,}\) and \(c_3\) such that
\begin{align*} c_1q_1 + c_2q_2 +c_3q_3 \amp = p \\ c_1(x^2+3x) + c_2(-x^2+x) + c_3(2x^2 + 4x + 2) \amp = a_0 + a_1x + a_2x^2\\ (c_1-c_2+2c_3)x^2 + (3c_1+c_2+4c_3)x + 2c_3 \amp = a_0 + a_1x + a_2x^2\text{.} \end{align*}
This is a linear system of three equations in the three variables \(c_1\text{,}\) \(c_2\text{,}\) and \(c_3\text{,}\) which may be written as
\begin{equation*} \begin{bmatrix} 1 \amp -1 \amp 2 \\ 3 \amp 1 \amp 4 \\ 0 \amp 0 \amp 2 \\ \end{bmatrix} \threevec{c_1}{c_2}{c_3} = \threevec{a_2}{a_1}{a_0}\text{,} \end{equation*}
which has a unique solution for every vector \(\threevec{a_2}{a_1}{a_0}\) since the coefficient matrix is invertible. This says that \(\laspan{q_1,q_2,q_3} = \pbb_2\) and, considering the case \(p=0\text{,}\) that these polynomials are linearly independent.
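To make this concrete, here is a SymPy sketch (an illustration, not part of the text) that confirms the coefficient matrix is invertible and solves for the coordinates \(c_1\text{,}\) \(c_2\text{,}\) \(c_3\text{:}\)

```python
import sympy as sp

a0, a1, a2 = sp.symbols('a0 a1 a2')

# Coefficient matrix from matching the x^2, x, and constant terms.
M = sp.Matrix([[1, -1, 2],
               [3,  1, 4],
               [0,  0, 2]])
print(M.det())  # 8, which is nonzero, so each system has a unique solution

# Coordinates of an arbitrary polynomial a0 + a1*x + a2*x**2.
print(M.solve(sp.Matrix([a2, a1, a0])))
```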

Example 1.1.21.

Consider the set of polynomials
\begin{align*} p_0 \amp = 1 \\ p_1 \amp = 1 +x \\ p_2 \amp = 1 +x + x^2\\ \amp \vdots \\ p_n \amp = 1 +x+x^2+\ldots+x^n \end{align*}
in \(\pbb_n\text{.}\) We claim that these polynomials form a basis for \(\pbb_n\text{.}\)
To see that they are linearly independent, we will suppose that they are linearly dependent and derive a contradiction. Suppose that
\begin{equation*} c_0p_0 + c_1p_1 + \ldots + c_np_n = 0 \end{equation*}
and that some of the scalars are nonzero. Let \(c_k\) be the last nonzero scalar so that
\begin{equation*} c_0p_0 + c_1p_1 + \ldots + c_kp_k = c_kx^k + \text{ lower order terms}\text{.} \end{equation*}
That is, \(c_kx^k\) is the only term involving \(x^k\text{,}\) so the coefficient of \(x^k\) in the linear combination is \(c_k\text{.}\) Since the combination is the zero polynomial, we have \(c_k=0\text{,}\) which contradicts our assumption that \(c_k\neq 0\text{.}\)
To see that these polynomials span \(\pbb_n\text{,}\) we offer a proof by induction. When \(n=0\text{,}\) we see that \(p_0 = 1\) spans \(\pbb_0\text{.}\) Now suppose that \(p_0,p_1,\ldots,p_{n-1}\) span \(\pbb_{n-1}\) and that \(p(x)=a_0+a_1x+a_2x^2 + \ldots + a_nx^n\) is a polynomial in \(\pbb_n\text{.}\) Notice that the polynomials \(p(x)\) and \(a_np_n(x)\) have the same coefficient of \(x^n\text{.}\) Therefore,
\begin{equation*} p(x) - a_np_n(x) \end{equation*}
is a polynomial in \(\pbb_{n-1}\) and can be written as a linear combination of \(p_0,p_1,\ldots,p_{n-1}\text{.}\) This means that
\begin{align*} p - a_np_n \amp = c_0p_0+c_1p_1+\ldots+c_{n-1}p_{n-1}\\ p \amp = c_0p_0+c_1p_1+\ldots+c_{n-1}p_{n-1} + a_np_n\text{,} \end{align*}
which expresses \(p\) as a linear combination of \(p_0,p_1,\ldots,p_n\text{.}\) This completes the induction and shows that these polynomials span \(\pbb_n\text{.}\)
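For a small case, one can guess and verify explicit coordinates. The closed form below, with \(c_k = a_k - a_{k+1}\) and \(c_n = a_n\text{,}\) is not derived in the text; the SymPy sketch simply checks it when \(n=2\text{:}\)

```python
import sympy as sp

x, a0, a1, a2 = sp.symbols('x a0 a1 a2')
p0, p1, p2 = sp.Integer(1), 1 + x, 1 + x + x**2

# Conjectured coordinates: c0 = a0 - a1, c1 = a1 - a2, c2 = a2.
combo = (a0 - a1)*p0 + (a1 - a2)*p1 + a2*p2
print(sp.expand(combo))  # a0 + a1*x + a2*x**2
```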

Example 1.1.22.

There is no finite set that forms a basis for \(\pbb\text{,}\) the set of all polynomials. Given any finite set of polynomials, there is a maximal degree \(m\) among them, so every polynomial in the span has degree at most \(m\text{.}\) Therefore, the polynomial \(x^{m+1}\) is not in the span, and the set cannot be a basis.

Definition 1.1.23.

We say that a vector space \(V\) is finite dimensional if there is a finite set whose span is \(V\text{.}\) Otherwise, we say that \(V\) is infinite dimensional.
Notice that any finite dimensional vector space must have a basis, as the following proposition records.

Proposition 1.1.24.

Every finite dimensional vector space has a basis.

Proof.

If \(V\) is a finite dimensional vector space, there is a finite set of vectors whose span is \(V\text{.}\) If this set of vectors is linearly independent, then it forms a basis. If not, we can remove one vector that is a linear combination of the others. Proposition 1.1.17 says that the span of the remaining vectors is still \(V\text{,}\) so we continue removing vectors one at a time until we obtain a linearly independent set, which must be a basis.
Notice that the two bases for \(\pbb_2\) in Example 1.1.19 and Example 1.1.20 both consist of three polynomials. This is generally true as we will begin to explain. First, we will prove a more technical, but still useful, result.

Proposition 1.1.25.

If \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is a linearly independent set in \(V\) and \(\wvec_1,\wvec_2,\ldots,\wvec_n\) is a set whose span is \(V\text{,}\) then \(m\leq n\text{.}\)

Proof.

Suppose that \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is a linearly independent set in the vector space \(V\) and that \(\wvec_1,\wvec_2,\ldots,\wvec_n\) is a set whose span is \(V\text{.}\) We wish to show that \(m\leq n\text{.}\)
We first construct a new list
\begin{equation*} \vvec_m,\wvec_1,\wvec_2,\ldots,\wvec_n \end{equation*}
whose span is \(V\text{.}\) Because the span of the \(\wvec\) vectors is \(V\text{,}\) \(\vvec_m\) is a linear combination of the \(\wvec\) vectors, which means that this set of vectors is linearly dependent. We let \(\uvec\) be the first vector in the list that is a linear combination of vectors that occur previously in the list. Since the set of \(\vvec\) vectors is linearly independent, \(\vvec_m\) is nonzero, which means that \(\uvec\) must be one of the \(\wvec\) vectors. If we remove \(\uvec\text{,}\) we have a new list
\begin{equation*} \vvec_m,\wvec_1,\ldots,\widehat{\wvec_j},\ldots,\wvec_n \end{equation*}
whose span is \(V\) by Proposition 1.1.17. Notice that the cardinality of this new list is \(n\text{.}\)
We can repeat this process. We prepend \(\vvec_{m-1}\) to the list to obtain
\begin{equation*} \vvec_{m-1},\vvec_m,\wvec_1,\ldots,\widehat{\wvec_j},\ldots,\wvec_n, \end{equation*}
which must be linearly dependent. Let \(\uvec\) be the first vector in the list that is a linear combination of vectors that occur previously in the list. Once again, since the \(\vvec\) vectors form a linearly independent set, we know that \(\uvec\) is one of the \(\wvec\) vectors. We can remove \(\uvec\) to obtain a new list of vectors whose span is \(V\text{.}\) Again, the cardinality of this new list is \(n\text{.}\)
We continue this process until all the \(\vvec\) vectors have been added to the beginning of the list. At each step, the vector we remove is one of the \(\wvec\) vectors since the \(\vvec\) vectors are linearly independent. Therefore, we have a list of \(n\) vectors that contains \(\vvec_1,\vvec_2,\ldots,\vvec_m\text{,}\) which says that \(m\leq n\text{.}\)

Proposition 1.1.26.

Any two bases of a finite dimensional vector space have the same number of vectors.

Proof.

Suppose that \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is one basis and that \(\wvec_1,\wvec_2,\ldots,\wvec_n\) is another. The set of \(\vvec\) vectors forms a linearly independent set and the set of \(\wvec\) vectors spans \(V\text{.}\) By Proposition 1.1.25, we know that \(m\leq n\text{.}\)
We can repeat this argument interchanging the two bases to conclude that \(n\leq m\text{.}\) Put together, these two facts mean that \(m=n\text{.}\)
If \(V\) is a finite dimensional vector space, we define its dimension to be the number of vectors in a basis. In this case, the number of vectors in any basis is the same so this definition does not depend on which basis we choose.

Definition 1.1.27.

If a vector space \(V\) has a basis with \(n\) vectors, we say that the dimension of \(V\) is \(n\) and write \(\dim V = n\text{.}\)
We may informally think of the dimension of a vector space as a measure of its size. Therefore, it should follow that the dimension of a subspace cannot be larger than the dimension of the vector space in which it resides. We first call attention to a useful fact.

Proposition 1.1.28.

If \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is a linearly independent set in \(V\) whose span is not \(V\text{,}\) then there is a vector \(\uvec\) in \(V\) such that \(\vvec_1,\vvec_2,\ldots,\vvec_m,\uvec\) is linearly independent.

Proof.

Under the assumptions of this proposition, the span of \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is not \(V\) so there is a vector \(\uvec\) that is not in the span of the \(\vvec\) vectors. This means that it is not a linear combination of the \(\vvec\) vectors and therefore
\begin{equation*} \vvec_1,\vvec_2,\ldots,\vvec_m,\uvec \end{equation*}
is a linearly independent set. Indeed, if \(a_1\vvec_1+a_2\vvec_2+\ldots+a_m\vvec_m+b\uvec = 0\) and \(b\neq 0\text{,}\) then \(\uvec\) would be a linear combination of the \(\vvec\) vectors. Therefore \(b=0\text{,}\) and the linear independence of the \(\vvec\) vectors then implies that each \(a_j=0\text{.}\)

Proposition 1.1.29.

If \(\dim V = n\text{,}\) then any linearly independent set of \(n\) vectors \(\vvec_1,\vvec_2,\ldots,\vvec_n\) spans \(V\) and therefore forms a basis for \(V\text{.}\)

Proof.

Any linearly independent subset of \(V\) can have no more than \(n\) vectors by Proposition 1.1.25. If this linearly independent set of \(n\) vectors does not span \(V\text{,}\) then by Proposition 1.1.28, we can add a vector \(\uvec\) so that
\begin{equation*} \vvec_1,\vvec_2,\ldots,\vvec_n,\uvec \end{equation*}
is a linearly independent subset of \(V\text{.}\) This cannot happen, however, since this set would have \(n+1\) vectors. Therefore, \(\vvec_1,\vvec_2,\ldots,\vvec_n\) must span \(V\text{.}\)

Proposition 1.1.30.

If \(W\) is a subspace of a finite dimensional vector space \(V\text{,}\) then \(W\) is finite dimensional and \(\dim W \leq \dim V\text{.}\)

Proof.

We will first explain why \(W\) is a finite dimensional vector space, which means we need to explain why there is a finite set of vectors in \(W\) whose span is \(W\text{.}\) We begin with any set of vectors \(\wvec_1,\wvec_2,\ldots,\wvec_m\) in \(W\text{.}\) By Proposition 1.1.17, we can remove vectors one at a time until we obtain a linearly independent set in \(W\text{.}\) If this set does not span \(W\text{,}\) then we can add vectors in \(W\) one at a time to obtain new linearly independent sets in \(W\text{.}\) This process must stop at some point since any linearly independent set in \(V\) can have no more than \(\dim V\) vectors by Proposition 1.1.25. When it stops, we have obtained a finite set that spans \(W\text{,}\) which says that \(W\) is finite dimensional.
Since any basis for \(W\) is also a linearly independent subset of \(V\text{,}\) it can contain no more vectors than a basis of \(V\text{.}\) This tells us that
\begin{equation*} \dim W \leq \dim V\text{.} \end{equation*}

Proposition 1.1.31.

Any linearly independent set in a finite dimensional vector space \(V\) can be extended to a basis of \(V\text{.}\)

Proof.

Suppose that \(\vvec_1,\vvec_2,\ldots,\vvec_m\) is a linearly independent set in \(V\) and that \(\wvec_1,\wvec_2,\ldots,\wvec_n\) is a basis for \(V\text{.}\) Join the two lists together to obtain
\begin{equation*} \vvec_1,\ldots,\vvec_m,\wvec_1,\ldots,\wvec_n\text{.} \end{equation*}
We are guaranteed that the span of this set is \(V\text{.}\) If it is not a linearly independent set, then we remove the first vector that is a linear combination of the others. Since the \(\vvec\) vectors are linearly independent, the vector that is removed must be one of the \(\wvec\) vectors. Continuing in this way, we eventually obtain a basis that includes the vectors \(\vvec_1,\ldots,\vvec_m\text{.}\)
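The proof is constructive, and the procedure is easy to simulate numerically. The following Python sketch carries it out in \(\real^3\) with a hypothetical starting vector, using a rank test to decide whether a vector enlarges the span (a slight variant of the proof: we keep a vector only when it is independent of those already kept):

```python
import sympy as sp

# Start with a linearly independent set (here a single vector) and
# append a known basis of R^3.
v = [sp.Matrix([1, 1, 0])]
w = [sp.Matrix([1, 0, 0]), sp.Matrix([0, 1, 0]), sp.Matrix([0, 0, 1])]

basis = []
for vec in v + w:
    # Keep vec only if it is not a linear combination of the kept vectors.
    if sp.Matrix.hstack(*(basis + [vec])).rank() == len(basis) + 1:
        basis.append(vec)

print([list(b) for b in basis])  # a basis of R^3 containing (1, 1, 0)
```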
The following is a consequence of Proposition 1.1.29.
Some further consequences of these ideas follow.