Introduction: Vector space – Serlo

We already know the vector spaces $\mathbb {R} ^{2}$ and $\mathbb {R} ^{3}$ from school. There we got to know them in the form of coordinate systems. The concept of a vector space is much broader in mathematics. In the following, we will develop the abstract mathematical concept of a vector space starting from the vector spaces known from school. Vector spaces have wide applications in science, technology and data analysis.

The vector space $\mathbb {R} ^{n}$

Video: Explanation and definition of a vector space (YouTube video by the channel "MJ Education")

In $\mathbb {R} ^{2}$ and $\mathbb {R} ^{3}$ we know vectors in the form of points in the plane or in space. Sometimes we also encounter arrows as representatives of vectors in the coordinate system. Vectors can be described in $\mathbb {R} ^{2}$ by two and in $\mathbb {R} ^{3}$ by three coordinates. The following figure shows, for example, the arrow representation of the vector $v=(2,1)^{T}$ :

Figure: A vector in $\mathbb {R} ^{2}$ represented by an arrow

Often, however, three coordinates are not enough to represent all the desired information. This is shown by the following two examples:

Example (Radio probe)

Suppose we send off a radio probe (a balloon with a measuring device) in order to investigate the earth's atmosphere. Besides the position of the probe (three data points) we record further measurements, namely temperature and air pressure. The three coordinates of $\mathbb {R} ^{3}$ are already needed to represent the position of the probe. For the representation of the other measured values we need two more coordinates. We assume that the probe is located 20 m to the east, 30 m to the north and at a height of 15 m, measured from the measuring station. At this time, our instruments in the radio probe show a temperature of 13 °C and an air pressure of 1 bar. To write down all recorded data at once, we write the vector $a=(20,30,15,13,1)^{T}$ . Here, the superscript $T$ (transpose) turns the row into a column, so that the column vector can be written in a space-saving way within a line of text. Written out as a column vector, this is

a={\begin{pmatrix}20\\30\\15\\13\\1\end{pmatrix}}

Thus we are in $\mathbb {R} ^{5}$ instead of $\mathbb {R} ^{3}$ , because we need five instead of three numbers to describe the vector.

Example (Stocks)

We consider the stock values of 30 companies at a certain point in time. We can record those in a vector with 30 entries, where each entry stands for the value of the respective share at that point in time. We get a vector in $\mathbb {R} ^{30}$ that gives the current state of the financial market. We can extend the 30 values further by including other stocks. Ultimately, then, we can choose any natural number $n$ as the dimension of our "stock vector". The current state of a stock market can thus be encoded by a vector in $\mathbb {R} ^{n}$ with $n$ entries.

We have seen from the examples that it can be useful to extend $\mathbb {R}$ by adding more dimensions to a general vector space $\mathbb {R} ^{n}$ . And there are many more examples! In the transition from $\mathbb {R} ^{2}$ to $\mathbb {R} ^{3}$ we can still vividly imagine that we increase the dimension by adding an independent direction. In higher dimensions we lack this geometric notion. However, we can imagine higher dimensional vector spaces very well in the tuple notation. An additional dimension can be achieved by adding another number. These numbers can all be chosen independently and we call them coordinates.

Generalization to $K^{n}$

So far we have created vector spaces by adding further dimensions to $\mathbb {R}$ . Now we want to look at which properties of the real numbers are relevant for this and, based on this, generalize the vector space notion further. We are familiar with the rules of $\mathbb {R}$ . We already know the vector addition and the scalar multiplication in $\mathbb {R} ^{2}$ and in $\mathbb {R} ^{3}$ and we can visualize these vividly.

Figure: Addition of two vectors in $\mathbb {R} ^{2}$
Figure: Scalar multiplication in $\mathbb {R} ^{2}$

In the same way, however, we can also calculate in higher dimensions. Thus the sum of the vectors $v:=(-1,0,2,4)^{T}$ and $w:=(2,1,-1,0)^{T}\in \mathbb {R} ^{4}$ is just given by summing up the entries:

v+w={\begin{pmatrix}\color {red}{-1}\\0\\2\\4\end{pmatrix}}+{\begin{pmatrix}\color {red}{2}\\1\\-1\\0\end{pmatrix}}={\begin{pmatrix}\color {red}{-1+2}\\0+1\\2+(-1)\\4+0\end{pmatrix}}={\begin{pmatrix}\color {red}{1}\\1\\1\\4\end{pmatrix}}

The scalar multiplication of $v:=\left(0,3,-1,{\tfrac {1}{2}}\right)^{T}\in \mathbb {R} ^{4}$ with some $\alpha :=2$ is done by multiplying all entries separately:

\alpha \cdot v=2\cdot {\begin{pmatrix}0\\3\\\color {OliveGreen}{-1}\\{\tfrac {1}{2}}\end{pmatrix}}={\begin{pmatrix}2\cdot 0\\2\cdot 3\\\color {OliveGreen}{2\cdot (-1)}\\2\cdot {\tfrac {1}{2}}\end{pmatrix}}={\begin{pmatrix}0\\6\\\color {OliveGreen}{-2}\\1\end{pmatrix}}
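The two calculations above can also be reproduced in a few lines of Python. This is only a minimal sketch: vectors are stored as plain tuples, and the helper names `add` and `scale` are chosen here for illustration and do not come from the article.

```python
def add(v, w):
    """Component-wise sum of two vectors with the same number of entries."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of entries")
    return tuple(a + b for a, b in zip(v, w))


def scale(alpha, v):
    """Multiply every entry of the vector v by the scalar alpha."""
    return tuple(alpha * a for a in v)


# The example vectors from the text, written as tuples.
v = (-1, 0, 2, 4)
w = (2, 1, -1, 0)

print(add(v, w))                  # (1, 1, 1, 4)
print(scale(2, (0, 3, -1, 0.5)))  # (0, 6, -2, 1.0)
```

Both functions work for tuples of any length $n$, which mirrors the step from $\mathbb {R} ^{4}$ to $\mathbb {R} ^{n}$.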

Just as in $\mathbb {R} ^{4}$ we can proceed in general in $\mathbb {R} ^{n}$ . Let us now consider which properties of $\mathbb {R}$ guarantee that computing with vectors in $\mathbb {R} ^{n}$ is possible. We see from the examples above that scalar multiplication and addition of vectors correspond, in each component, to multiplication and addition in $\mathbb {R}$ , respectively. Thus in the first component of the addition we compute $\color {red}{-1+2=1}$ . Likewise, for the scalar multiplication we have in the third component $\color {OliveGreen}{2\cdot (-1)=-2}$ .

So the arithmetic in $\mathbb {R} ^{n}$ is traced back to the addition and multiplication in $\mathbb {R}$ . Here we have another possibility for abstraction. A set in which one can add and multiply as in the real numbers is called a field (and $\mathbb {R}$ is such a field). So it should be sufficient if the numbers of the vector tuple come from a field. Thus we can form a vector space from every general field $K$ . So it also works for other fields like the rational numbers $\mathbb {Q}$ or the complex numbers $\mathbb {C}$ . Analogous to the $\mathbb {R} ^{n}$ we start with the field $K$ and build up a vector space $K^{n}$ by adding further "independent directions".

Example (The vector space $\mathbb {Q} ^{3}$ )

The vector space $\mathbb {Q} ^{3}$ is like the $\mathbb {R} ^{3}$ a set of tuples $(a,b,c)^{T}$ , only that the entries are exclusively rational numbers from $\mathbb {Q}$ and not real numbers from $\mathbb {R}$ . We have hence that $a,b,c\in \mathbb {Q}$ . Thus $(1,2,3)^{T}$ and $\left(-1,{\tfrac {1}{2}},-{\tfrac {42}{23}}\right)^{T}$ are vectors from $\mathbb {Q} ^{3}$ . In contrast $(1,{\sqrt {2}},-3)^{T}$ is not a vector from $\mathbb {Q} ^{3}$ , because in the second component with ${\sqrt {2}}$ there is a non-rational (or also called irrational) number in the tuple.
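As a small illustration, rational entries can be modelled exactly with `Fraction` from Python's standard library. The membership test `in_Q3` below is a hypothetical helper that only inspects the data type of the entries. It is a stand-in for "is a rational number" under the assumption that irrational values such as ${\sqrt {2}}$ are stored as floating-point approximations.

```python
from fractions import Fraction
import math


def in_Q3(t):
    """Return True if t has three entries that are all exact rationals (int or Fraction)."""
    return len(t) == 3 and all(isinstance(x, (int, Fraction)) for x in t)


u = (Fraction(1), Fraction(2), Fraction(3))
v = (Fraction(-1), Fraction(1, 2), Fraction(-42, 23))

# The component-wise sum of two vectors from Q^3 has rational entries again.
s = tuple(a + b for a, b in zip(u, v))

print(s == (Fraction(0), Fraction(5, 2), Fraction(27, 23)))  # True
print(in_Q3(s))                                               # True
print(in_Q3((1, math.sqrt(2), -3)))                           # False
```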

Relation to polynomials

Above we used vectors of $\mathbb {R} ^{n}$ in tuple notation to describe systems with $n$ units of information. We find the structure of computing with tuples elsewhere as well. Consider the polynomial of degree 2 (a quadratic polynomial), given by $f(x)=7x^{2}+3x-2={\color {Red}7}\cdot x^{2}+{\color {OliveGreen}3}\cdot x^{1}+({\color {NavyBlue}-2})\cdot x^{0}$ . We always sort the summands such that the exponents are ordered descending from the degree of the polynomial 2 to $0$ . In doing so, we note that this polynomial has similarities to the vector $({\color {Red}7},{\color {OliveGreen}3},{\color {NavyBlue}-2})^{T}$ . Here the first coefficient of the polynomial is in the first component of the vector and so on. Basically, the vector encodes the polynomial.

We can observe the same similarity between addition and scalar multiplication of polynomials on the one hand and the associated operations of vectors on the other. Let us take the polynomials $f:\mathbb {R} \to \mathbb {R}$ with $f(x)=7x^{2}+3x-2$ and $g:\mathbb {R} \to \mathbb {R}$ with $g(x)=x^{2}+6$ and the scalar $\rho =-1$ . We can write the polynomials as tuples:

{\begin{array}{ccccccc}f(x)&=&{\color {Red}7}x^{2}&+{\color {OliveGreen}3}x&+({\color {NavyBlue}-2})&\leftrightarrow &({\color {Red}7},{\color {OliveGreen}3},{\color {NavyBlue}-2})^{T}\\g(x)&=&{\color {Red}1}x^{2}&+{\color {OliveGreen}0}x&+{\color {NavyBlue}6}&\leftrightarrow &({\color {Red}1},{\color {OliveGreen}0},{\color {NavyBlue}6})^{T}\end{array}}

Now we calculate $f(x)+g(x)$ in both forms of representation:

{\begin{array}{ccccc}{\color {Red}7}x^{2}&+{\color {OliveGreen}3}x&+({\color {NavyBlue}-2})&\leftrightarrow &({\color {Red}7},{\color {OliveGreen}3},{\color {NavyBlue}-2})^{T}\\&+&&&+\\{\color {Red}1}x^{2}&+{\color {OliveGreen}0}x&+{\color {NavyBlue}6}&\leftrightarrow &({\color {Red}1},{\color {OliveGreen}0},{\color {NavyBlue}6})^{T}\\&=&&&=\\{\color {Red}8}x^{2}&+{\color {OliveGreen}3}x&+{\color {NavyBlue}4}&\leftrightarrow &({\color {Red}8},{\color {OliveGreen}3},{\color {NavyBlue}4})^{T}\end{array}}

Also the multiplication of $f(x)$ with the factor $\rho$ corresponds with the respective calculation in the associated vector tuples:

{\begin{array}{ccccc}&-1&&&-1\\&\cdot &&&\cdot \\{\color {Red}7}x^{2}&+{\color {OliveGreen}3}x&+({\color {NavyBlue}-2})&\leftrightarrow &({\color {Red}7},{\color {OliveGreen}3},{\color {NavyBlue}-2})^{T}\\&=&&&=\\{\color {Red}-7}x^{2}&+({\color {OliveGreen}-3})x&+{\color {NavyBlue}2}&\leftrightarrow &({\color {Red}-7},{\color {OliveGreen}-3},{\color {NavyBlue}2})^{T}\\\end{array}}

Every polynomial of degree at most 2 can be uniquely represented by a three-dimensional vector in the way described (a missing power of $x$ gets the coefficient $0$ , as for $g$ above). Conversely, every three-dimensional vector uniquely describes a polynomial of degree at most 2. Thus we find a bijective map between the set of polynomials of degree at most 2 and $\mathbb {R} ^{3}$ . Similarly, there exists a bijective map between the polynomials of degree at most 3 and $\mathbb {R} ^{4}$ , and in general between the polynomials of degree at most $n$ and $\mathbb {R} ^{n+1}$ .
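The correspondence between polynomials and coefficient tuples can be checked numerically. The following sketch assumes that a polynomial is stored as a tuple of coefficients in descending order of exponents, as in the tables above. The function name `evaluate` is chosen for illustration only.

```python
def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficients (a_n, ..., a_1, a_0) at x (Horner's scheme)."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


f = (7, 3, -2)   # f(x) = 7x^2 + 3x - 2
g = (1, 0, 6)    # g(x) = x^2 + 6

# Vector addition and scalar multiplication on the coefficient tuples.
f_plus_g = tuple(a + b for a, b in zip(f, g))
minus_f = tuple(-1 * a for a in f)

print(f_plus_g)  # (8, 3, 4)
print(minus_f)   # (-7, -3, 2)

# The tuple operations agree with the operations on the polynomial values.
for x in range(-3, 4):
    assert evaluate(f_plus_g, x) == evaluate(f, x) + evaluate(g, x)
    assert evaluate(minus_f, x) == -1 * evaluate(f, x)
```

Checking finitely many values of $x$ is of course not a proof, but it illustrates that adding the tuples describes the same object as adding the polynomials.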

So far we have allowed all real numbers as coefficients of polynomials. We can also consider polynomials whose coefficients are elements of $\mathbb {Q}$ . Accordingly, the entries of the corresponding vector are rational numbers. Polynomials of degree at most $n$ with rational coefficients thus correspond to vectors from the vector space $\mathbb {Q} ^{n+1}$ . In fact, instead of $\mathbb {R}$ or $\mathbb {Q}$ , any field is allowed.

General vector spaces in mathematics

We have found that we can calculate with polynomials of degree at most $n$ in the same way as with vectors of $K^{n+1}$ . Thus, the set of polynomials of degree at most $n$ has a structure similar to that of $K^{n+1}$ . However, when considering all polynomials, that is, polynomials of any degree, we reach the limits of the notion of $K^{n}$ . In this set the polynomials can have arbitrarily large exponents:

{\begin{aligned}p_{1}(x)&=x^{\color {OliveGreen}1}\\p_{2}(x)&=x^{\color {OliveGreen}2}+x^{1}\\p_{3}(x)&=x^{\color {OliveGreen}3}+x^{2}+x^{1}\\&\vdots \end{aligned}}

To describe this set by tuples, we would need infinitely many entries. The space of all polynomials has infinitely many dimensions, while in $K^{n}$ we are limited to $n$ dimensions. Thus the set of all polynomials cannot be expressed by a set $K^{n}$ . Nevertheless, polynomials and tuples have a common structure, as we have already seen. This allows a further step of abstraction: by summarizing this common structure in a definition, we can talk about tuples as well as polynomials and about other sets with this structure.

What is this common structure? The commonality of polynomials and of tuples is that they can be added and scaled and that both operations behave similarly on both sets. This is the common structure that vector spaces have: vectors are objects that can be added and scaled.
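That polynomials of arbitrary degree can still be added and scaled, even though no fixed $n$ is available, can be sketched as follows. The representation as a dictionary from exponents to non-zero coefficients and the names `poly_add` and `poly_scale` are assumptions made for this illustration.

```python
def poly_add(p, q):
    """Add two polynomials given as {exponent: coefficient} dictionaries."""
    result = dict(p)
    for exponent, coefficient in q.items():
        result[exponent] = result.get(exponent, 0) + coefficient
    # Keep only the non-zero coefficients.
    return {e: c for e, c in result.items() if c != 0}


def poly_scale(alpha, p):
    """Multiply every coefficient of the polynomial p by the scalar alpha."""
    return {e: alpha * c for e, c in p.items() if alpha * c != 0}


p2 = {2: 1, 1: 1}          # p_2(x) = x^2 + x
p3 = {3: 1, 2: 1, 1: 1}    # p_3(x) = x^3 + x^2 + x

print(poly_add(p2, p3) == {3: 1, 2: 2, 1: 2})   # True
print(poly_scale(-1, p2) == {2: -1, 1: -1})     # True
```

Each single polynomial has only finitely many non-zero coefficients, so it fits into such a dictionary, but there is no upper bound on the exponents that works for all polynomials at once.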

We have noted a structural difference between $K^{n}$ and the vector space of all polynomials. However, they have in common that their elements can be added and scaled. Thus it seems natural to consider this property of vectors as the defining property of every vector space.

Up to now we have not considered which calculation rules apply to the addition and scalar multiplication of vectors in general vector spaces. In $\mathbb {R}$ we have the associative and commutative laws as well as the distributive law, and we know neutral and inverse elements with respect to addition and multiplication. As we have seen above, arithmetic in $\mathbb {R} ^{n}$ can be traced back to arithmetic in $\mathbb {R}$ . Accordingly, certain calculation rules of the real numbers transfer to the vector space $\mathbb {R} ^{n}$ , and analogously from every field $K$ to $K^{n}$ .

Deriving the definition of a vector space

The addition, scalar multiplication and all associated arithmetic laws provide the formal definition of the vector space. The starting point of our description of a vector space is a set $V$ containing all vectors of a vector space. In order for our vector space $V$ to contain at least one vector, we require that $V$ has to be non-empty. We have seen that the essential structure of a vector space is given by the arithmetic operations performed on it. So we need to formally describe addition and scalar multiplication on a vector space.

The additive structure of a vector space

We have already required that a vector space $V$ should be a non-empty set. Now we define via axioms what properties its additive structure must have. First, we note that an addition of vectors is an inner operation ^[1] $\boxplus :V\times V\to V$ . So it is a map where two vectors are mapped to another vector. The function value is the sum of the two input vectors.

We denote this map with the symbol $\boxplus$ . So $\boxplus (v,w)$ is the sum of the two vectors $v$ and $w$ . The notation $\boxplus (v,w)$ is analogous to the notation $f(v,w)$ , where instead of " $f$ " we write the symbol " $\boxplus$ ". Instead of the notation $\boxplus (v,w)$ the so-called infix notation $v\boxplus w$ is usually used, which we want to use in the following.

Here we use the operation sign " $\boxplus$ " to better distinguish between the addition of vectors and the addition of numbers " $+$ ", which we can first consider independently. In most textbooks, the symbol " $+$ " is also used for vector addition. Whether the addition of vectors or of numbers is meant must be inferred from the respective context. For convenience, we will later also use the symbol " $+$ " instead of " $\boxplus$ ".

To show that the set $V$ is provided with an operation " $\boxplus$ ", we write $(V,\boxplus )$ . However, in order for us to consider " $\boxplus$ " as an addition, this operation must satisfy certain characteristic properties that we already know from the addition of numbers. These are:

$V$ is closed with respect to $\boxplus$ . That means, the sum of two vectors again yields a well-defined vector in $V$ :
$\forall v,w\in V:v\boxplus w\in V$
The vector addition is commutative ( $\boxplus$ satisfies the commutative law):
$\forall v,w\in V:v\boxplus w=w\boxplus v$
The vector addition is associative ( $\boxplus$ satisfies the associative law):
$\forall v,w,z\in V:(v\boxplus w)\boxplus z=v\boxplus (w\boxplus z)$
The vector addition has a neutral element. This means that there is at least one vector $e\in V$ for which
$\forall v\in V:v\boxplus e=e\boxplus v=v$

Later we will show that it already follows from the other axioms that every vector space has exactly one neutral element. This neutral element $e$ is called the zero vector. For the zero vector from the vector space $V$ we write " $0_{V}$ ". If it is clear which vector space the zero vector comes from, then we write down " $0$ ".
For every vector $v$ there exists at least one additive inverse element $i\in V$ . For the vector $i$ inverse to $v$ we have that:
$v\boxplus i=i\boxplus v=e$

This means that the addition of every vector with its (additive) inverse must yield the neutral element $e$ or, in other words, the zero vector. We will show later that the inverse vector $i$ is unique. So for every vector $v$ there is exactly one inverse vector $i$ to it. We call this vector inverse or negative to $v$ and usually write " $-v$ " for it.

A set with an operation satisfying the above five axioms is also called an abelian group ^[2].
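To make the axioms more concrete, the following sketch tests them on a handful of sample vectors from $\mathbb {Q} ^{2}$ . It assumes that $\boxplus$ is the component-wise addition of tuples. The names `boxplus`, `neg`, `zero` and `samples` are chosen for this illustration. A test on finitely many vectors cannot replace a proof; it only shows what each axiom demands.

```python
from fractions import Fraction
from itertools import product


def boxplus(v, w):
    """Component-wise addition, playing the role of the vector addition."""
    return tuple(a + b for a, b in zip(v, w))


def neg(v):
    """Candidate for the additive inverse of v."""
    return tuple(-a for a in v)


zero = (Fraction(0), Fraction(0))  # candidate for the neutral element
samples = [
    (Fraction(1), Fraction(2)),
    (Fraction(-1, 2), Fraction(3)),
    (Fraction(5, 7), Fraction(0)),
    zero,
]

commutative = all(boxplus(v, w) == boxplus(w, v)
                  for v, w in product(samples, repeat=2))
associative = all(boxplus(boxplus(v, w), z) == boxplus(v, boxplus(w, z))
                  for v, w, z in product(samples, repeat=3))
neutral = all(boxplus(v, zero) == v == boxplus(zero, v) for v in samples)
inverse = all(boxplus(v, neg(v)) == zero == boxplus(neg(v), v) for v in samples)

print(commutative, associative, neutral, inverse)  # True True True True
```

Exact fractions are used instead of floating-point numbers so that rounding errors do not disturb the comparisons.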

The scalar multiplication

We have already defined which properties the addition of vectors must fulfil. The scalar multiplication of vectors is still missing. So that we can distinguish the scalar multiplication from the ordinary multiplication of numbers, we first use the symbol " $\boxdot$ " for it. In textbooks the symbol " $\cdot$ " is used instead of " $\boxdot$ ", or the dot is even omitted completely. Which operation is meant then follows from the context. We will use this notation later. The scalar multiplication maps a number (scaling factor) and a vector to another vector.

{\color {OliveGreen}\underbrace {\rho } _{\text{scaling factor}}}\boxdot {\color {NavyBlue}\underbrace {v} _{\text{initial vector}}}={\color {NavyBlue}\underbrace {w} _{\text{scaled vector}}}

The notation $\rho \boxdot v$ means that $v$ is stretched (or compressed) by $\rho$ . It is natural to take the scalar from the real numbers, $\rho \in \mathbb {R}$ . However, we can still generalize this. All sets in which one can add and multiply similarly to the real numbers come into question as the basic set for scaling factors. Such a set is called a field in mathematics.

The properties of scalar multiplication " $\boxdot$ " are similar to those of the multiplication of numbers. We now want to define scalar multiplication formally by axioms. As with addition, a non-empty set $V$ is the starting point of the definition. In addition, we need a field $K$ . The scalar multiplication is an outer operation $\boxdot :K\times V\rightarrow V$ satisfying the following properties:

scalar distributive law:
$\forall \lambda ,\rho \in K\ \forall v\in V:(\lambda +\rho )\boxdot v=(\lambda \boxdot v)\boxplus (\rho \boxdot v)$
vectorial distributive law:
$\forall \lambda \in K\ \forall v,w\in V:\lambda \boxdot (v\boxplus w)=(\lambda \boxdot v)\boxplus (\lambda \boxdot w)$
associative law for scalars:
$\forall \lambda ,\rho \in K\ \forall v\in V:(\lambda \cdot \rho )\boxdot v=\lambda \boxdot (\rho \boxdot v)$
Let $1\in K$ be the neutral element of the multiplication in the field $K$ . Then, $1$ is also the neutral element of scalar multiplication:
$\forall v\in V:1\boxdot v=v$

In order to be able to scale vectors, we also need a field $K$ in the definition of a vector space. This field contains the scaling factors. Therefore, vector spaces $V$ are always defined over a field $K$ . We say " $V$ is a vector space over $K$ " or briefly " $V$ is a $K$ -vector space" to express that the scaling factors for $V$ come from $K$ .
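The four axioms of scalar multiplication can be tried out in the same spirit. The sketch below assumes $K=\mathbb {Q}$ and $V=\mathbb {Q} ^{2}$ with component-wise operations. The names `boxplus`, `boxdot`, `scalars` and `vectors` are hypothetical and serve only this illustration. Again, checking a few samples only illustrates the axioms and does not prove them.

```python
from fractions import Fraction
from itertools import product


def boxplus(v, w):
    """Component-wise addition of two vectors."""
    return tuple(a + b for a, b in zip(v, w))


def boxdot(lam, v):
    """Scalar multiplication: scale every entry of v by lam."""
    return tuple(lam * a for a in v)


scalars = [Fraction(0), Fraction(1), Fraction(-2), Fraction(3, 4)]
vectors = [
    (Fraction(1), Fraction(2)),
    (Fraction(-1, 2), Fraction(3)),
    (Fraction(0), Fraction(0)),
]

scalar_distributive = all(
    boxdot(lam + rho, v) == boxplus(boxdot(lam, v), boxdot(rho, v))
    for lam, rho, v in product(scalars, scalars, vectors))
vector_distributive = all(
    boxdot(lam, boxplus(v, w)) == boxplus(boxdot(lam, v), boxdot(lam, w))
    for lam, v, w in product(scalars, vectors, vectors))
scalar_associative = all(
    boxdot(lam * rho, v) == boxdot(lam, boxdot(rho, v))
    for lam, rho, v in product(scalars, scalars, vectors))
unit_law = all(boxdot(Fraction(1), v) == v for v in vectors)

print(scalar_distributive, vector_distributive, scalar_associative, unit_law)
# True True True True
```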

Definition of a vector space

We can write down our considerations in a compressed way to get the formal definition of a vector space:

Definition (vector space)

Let $K$ be a field and let $V$ be a non-empty set with an inner operation $\boxplus :V\times V\to V$ (the vector addition) and an outer operation $\boxdot :K\times V\to V$ (the scalar multiplication). The set $V$ with these two operations is called a vector space over the field $K$ , or alternatively a $K$ -vector space, if the following axioms hold:

$V$ together with the operation $\boxplus$ forms an abelian group. That is, the following axioms are satisfied:
1. associative law: For all $v,w,z\in V$ we have that: $v\boxplus (w\boxplus z)=(v\boxplus w)\boxplus z$ .
2. commutative law: For all $v,w\in V$ we have that: $v\boxplus w=w\boxplus v$ .
3. Existence of a neutral element: There is an element $0\in V$ such that for all $v\in V$ we have that: $v\boxplus 0=v$ . This vector $0$ is called the neutral element of addition or the zero vector.
4. Existence of an inverse element: For every $v\in V$ there exists an element $y\in V$ such that we have $v\boxplus y=0$ . The element $y$ is called the inverse element of $v$ . Instead of $y$ we also write $-v$ .
In addition, the following axioms of scalar multiplication $\boxdot$ must be satisfied:
1. Scalar distributive law: For all $\lambda ,\rho \in K$ and all $v\in V$ we have that: $(\lambda +\rho )\boxdot v=(\lambda \boxdot v)\boxplus (\rho \boxdot v)$ .
2. Vector distributive law: For all $\lambda \in K$ and all $v,w\in V$ we have that: $\lambda \boxdot (v\boxplus w)=(\lambda \boxdot v)\boxplus (\lambda \boxdot w)$ .
3. Associative law for scalars: For all $\lambda ,\rho \in K$ and all $v\in V$ it holds that: $(\lambda \cdot \rho )\boxdot v=\lambda \boxdot (\rho \boxdot v)$ .
4. Neutral element of scalar multiplication: For all $v\in V$ and for $1\in K$ (the neutral element of multiplication in $K$ ) we have that: $1\boxdot v=v$ . The 1 is called neutral element of scalar multiplication.

Instead of " $V$ " one often writes " $(V,\boxplus ,\boxdot )$ ". The latter notation makes clear that the set $V$ is equipped with the operations $\boxplus$ and $\boxdot$ .

Hint

We use the symbols " $\boxplus$ " and " $\boxdot$ " to distinguish them from addition " $+$ " and multiplication " $\cdot$ ". In the literature this distinction is often not made and from the context it becomes clear whether for example " $+$ " means an addition of numbers or of vectors.

References

[1] see also operation
[2] see abelian group


This article is licensed under the free license CC-BY-SA 3.0. This means you can use, modify or share it freely, as long as you name „Serlo“ as the source and put your changes under the same CC-BY-SA 3.0 license or a compatible license. On the page „Kopier uns!“ we explain what you have to pay attention to when using our texts, pictures or videos.
