Introduction: Matrices – Serlo

In this article, we introduce matrices as an efficient representation of linear maps. A matrix (of a linear map $f\colon K^n \to K^m$) is a rectangular arrangement of elements from $K$ ("numbers") that specifies where the standard basis of $K^n$ is mapped by $f$.

Derivation

Let $K$ be a field and $f\colon K^n \to K^m$ a linear map. We want to describe this map as efficiently as possible. We know from the article "vector space of a linear map" that the space of linear maps from $K^n$ to $K^m$ has dimension $m \cdot n$, and $f$ is an element of this space. So we need $m \cdot n$ numbers to describe our linear map. We are looking for an organized way to write down these numbers.

Let $e_1, \dots, e_n$ be the standard basis of $K^n$. Then, following the principle of linear continuation, $f$ is already completely determined by the vectors $f(e_1), \dots, f(e_n)$: If $v \in K^n$ is an arbitrary vector, we can write it as a linear combination $v = \sum_{j=1}^n \lambda_j e_j$ of the basis elements, and because of linearity we know the value $f(v) = \sum_{j=1}^n \lambda_j f(e_j)$.

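So we need the "data" $f(e_1), \dots, f(e_n)$ to describe the linear map. These data are $n$ vectors in $K^m$. So we can write them as

$$f(e_1) = \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}, \quad \dots, \quad f(e_n) = \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix}$$

for certain "numbers" $a_{ij} \in K$. This is a notation for tracking all necessary data of the linear map. But we can still make it more efficient: We just omit the "$f(e_j) =$" and agree on the convention that the $j$-th column describes the image of the $j$-th basis vector:

$$\begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}, \quad \dots, \quad \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix}$$

To save even more space, we can also combine the entries of these vectors into a single "table", still with the image of the $j$-th basis vector being in the $j$-th column:

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$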

We call this "table in parentheses" a matrix. It is the matrix associated with the linear map $f$.

The matrix completely determines $f$, and it consists of $m \cdot n$ numbers as entries, which is consistent with our considerations above.
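The construction above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: vectors are plain lists, a matrix is a list of rows, the helper names `standard_basis` and `matrix_of` are hypothetical, and the example map `f` is made up for this sketch rather than taken from the article.

```python
def standard_basis(n):
    """Return the standard basis e_1, ..., e_n of K^n as lists."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def matrix_of(f, n):
    """Return the matrix of a linear map f: K^n -> K^m as a list of rows.

    Column j of the result is the image f(e_j) of the j-th basis vector.
    """
    columns = [f(e) for e in standard_basis(n)]  # n vectors in K^m
    m = len(columns[0])
    # Entry (i, j) of the matrix is the i-th entry of f(e_j).
    return [[columns[j][i] for j in range(n)] for i in range(m)]


# Hypothetical linear map K^3 -> K^2, chosen only for illustration.
def f(v):
    x, y, z = v
    return [x + 2 * y, 3 * z]


A = matrix_of(f, 3)
print(A)  # [[1, 2, 0], [0, 0, 3]]
```

The resulting table has $2 \cdot 3 = 6$ entries, matching the dimension count from the beginning of the derivation.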

Definition

Definition (Matrix)

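Let $K$ be a field and $m, n \in \mathbb{N}$. Let $a_{ij} \in K$ for all $1 \le i \le m$ and $1 \le j \le n$. Then we call

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

an $(m \times n)$-matrix. We denote the set of all $(m \times n)$-matrices by $K^{m \times n}$.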

Example (Linear map from   to  )

We consider the linear map

 

That this map is indeed linear can be verified as an exercise.

In the derivation we have seen that we can describe   by a matrix. We want to compute this matrix here explicitly. To do so, we need to determine the images of the standard basis vectors

 

For these,

 

Thus the three vectors

 

contain all the information of the linear map  . If we write these side by side in a table, we get the matrix

 

which represents  .

Example (Embedding  )

Let us now consider the standard embedding of   into  , that is, the linear map

 

For the vectors of the standard basis, we have

 

So the embedding   is represented by the matrix

 

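Example (Reflection of $\mathbb{R}^2$ along an axis)

Finally, let us examine the reflection of $\mathbb{R}^2$ along the x-axis. When we mirror a vector $(x, y)^T \in \mathbb{R}^2$ along the x-axis, we keep its x-component fixed and change the sign of its y-component. The reflection is thus given by

$$\mathbb{R}^2 \to \mathbb{R}^2, \quad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x \\ -y \end{pmatrix}.$$

The first basis vector lies on the x-axis and is therefore not affected by the reflection. Formally:

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \mapsto \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

The second basis vector is perpendicular to the x-axis and is therefore mapped to its negative. Formally:

$$\begin{pmatrix} 0 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ -1 \end{pmatrix}$$

As the matrix associated with this reflection, we thus obtain:

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$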

Matrix-Vector Multiplication

Derivation

We have just seen how we can represent a linear map by a matrix. Suppose we are not given a linear map itself, but only its associated matrix. What does the image of an arbitrary vector under this linear map look like?

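First, for simplicity, let us consider the vector space $K^2$ and a linear map $f\colon K^2 \to K^2$ of which we know that the associated matrix is

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

That means, we have

$$f(e_1) = \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} \quad \text{and} \quad f(e_2) = \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}.$$

We want to calculate the image of an arbitrary vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in K^2$ under the map $f$, using the entries of the matrix $A$.

To do so, we represent our vector as a linear combination of the standard basis vectors, i.e.

$$x = x_1 e_1 + x_2 e_2.$$

Now we can exploit the linearity of $f$ and calculate:

$$f(x) = f(x_1 e_1 + x_2 e_2) = x_1 f(e_1) + x_2 f(e_2) = x_1 \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}.$$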

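By this calculation, we can describe the effect of applying a linear map $f$ to a vector using only the matrix $A$. This calculation works for any vector and any $(2 \times 2)$-matrix. To simplify the notation, let us define a "multiplication operation" for matrices and vectors:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} := \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}$$

We call this the "matrix-vector multiplication" and formally write it as a product. The generalization from a $(2 \times 2)$-matrix to an $(m \times n)$-matrix is given in the following exercise: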

Exercise

Let $f\colon K^n \to K^m$ be a linear map and $A = (a_{ij}) \in K^{m \times n}$ the associated matrix. Find a formula to calculate the value $f(x)$ for a given vector $x \in K^n$ by using the entries of the matrix $A$.

Solution

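We write $x$ as a linear combination of the standard basis vectors: let $x_1, \dots, x_n \in K$ be the "coordinates", such that $x = x_1 e_1 + \dots + x_n e_n$ holds. That $A$ is the matrix associated with $f$ means that $f(e_j) = (a_{1j}, \dots, a_{mj})^T$ is satisfied for all $j \in \{1, \dots, n\}$. Thus, it follows for $f(x)$ that

$$f(x) = f(x_1 e_1 + \dots + x_n e_n) = x_1 f(e_1) + \dots + x_n f(e_n) = x_1 \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix} + \dots + x_n \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + \dots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \dots + a_{mn} x_n \end{pmatrix}.$$

Using the sum notation, we can write the result as

$$f(x) = \begin{pmatrix} \sum_{j=1}^n a_{1j} x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} x_j \end{pmatrix}.$$

The solution of this exercise provides us with a formula to calculate the image of a vector under a linear map, using the associated matrix. We now define $A \cdot x$ using the formula found in the solution.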

Definition

Definition (Matrix-Vector Multiplication)

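Let $K$ be a field, $A = (a_{ij}) \in K^{m \times n}$ and $x = (x_1, \dots, x_n)^T \in K^n$. Then we define

$$A \cdot x = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} := \begin{pmatrix} \sum_{j=1}^n a_{1j} x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} x_j \end{pmatrix} \in K^m.$$

From another point of view this means: If we consider the matrix $A$ as a collection of column vectors

$$A = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix} \quad \text{with } a_1, \dots, a_n \in K^m,$$

then the product $A \cdot x$ is a linear combination of the columns of $A$ with the entries of $x$ as coefficients, namely $A \cdot x = x_1 a_1 + \dots + x_n a_n$.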
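To see that both points of view give the same result, here is a small Python sketch. It assumes that a matrix is stored as a list of rows; the function names `mat_vec` and `column_combination` and the sample values are our own choices for illustration.

```python
def mat_vec(A, x):
    """Compute A * x entry by entry: (A x)_i = sum over j of a_ij * x_j."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def column_combination(A, x):
    """Compute x_1 * a_1 + ... + x_n * a_n, where a_j is the j-th column of A."""
    m, n = len(A), len(x)
    result = [0] * m
    for j in range(n):          # go through the columns one by one
        for i in range(m):      # add x_j times the j-th column
            result[i] += x[j] * A[i][j]
    return result


# Sample values, assumed only for illustration.
A = [[1, 2, 0],
     [0, 0, 3]]
x = [4, 5, 6]

print(mat_vec(A, x))             # [14, 18]
print(column_combination(A, x))  # [14, 18]
```

Both functions compute the same sums, only in a different order: the first one proceeds row by row, the second one column by column.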

How can you best remember how applying a matrix to a vector works?

 
To apply a matrix to a vector, you need to compute "row times column".

You may perform a matrix-vector multiplication by using the rule "row times column": The first entry of the result is the first row of the matrix times the column vector. The second entry is the second row of the matrix times the column vector, etc. for larger matrices. For each "row times column" product, you multiply the related entries (first times first, second times second, etc.) and add the results.

It is important that the type of the matrix and the type of the vector match. If you have set up everything correctly so far, this should always be the case, because a linear map $f\colon K^n \to K^m$ has an associated $(m \times n)$-matrix. You can apply this matrix to vectors of $K^n$, since the rows of the matrix and the column vector both have length $n$.
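The "row times column" rule, together with the requirement that the types match, might be sketched in Python as follows. The function name `apply_matrix`, the error messages and the sample matrix are assumptions made for this illustration.

```python
def apply_matrix(A, x):
    """Apply an (m x n)-matrix A (given as a list of rows) to a vector x of length n."""
    n = len(A[0])
    if any(len(row) != n for row in A):
        raise ValueError("all rows of A must have the same length")
    if len(x) != n:
        raise ValueError(
            f"cannot apply a matrix with {n} columns to a vector of length {len(x)}"
        )
    # "Row times column": one entry of the result for each row of A.
    return [sum(a * b for a, b in zip(row, x)) for row in A]


# A (3 x 2)-matrix with assumed sample values; it can be applied to vectors of length 2.
A = [[1, 0],
     [0, -1],
     [2, 3]]
print(apply_matrix(A, [5, 7]))  # [5, -7, 31]
```

Calling `apply_matrix(A, [1, 2, 3])` raises a `ValueError` in this sketch, because the rows of `A` have length 2 while the vector has length 3.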

Reverse direction: The induced linear map

We have seen that every linear map has an associated matrix. Given a linear map $f\colon K^n \to K^m$, we constructed a matrix $A$ such that $f(x) = A \cdot x$ for all $x \in K^n$. That is, some matrices define a linear map. But do all matrices define a linear map? And if so, what does the corresponding map look like?

If a matrix $A$ is derived from a linear map $f$, then we can get $f$ back from $A$ by defining it as the map $x \mapsto A \cdot x$. More generally, we can apply this rule to any matrix $A$ and obtain a corresponding linear map $f_A\colon x \mapsto A \cdot x$.

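So let $A = (a_{ij})$ be an $(m \times n)$-matrix. We consider $f_A\colon K^n \to K^m, \; x \mapsto A \cdot x$. This map is indeed linear: for all $x, y \in K^n$ and $\lambda \in K$, we have

$$\begin{aligned}
f_A(x + y) &= A \cdot (x + y) = \begin{pmatrix} \sum_{j=1}^n a_{1j} (x_j + y_j) \\ \vdots \\ \sum_{j=1}^n a_{mj} (x_j + y_j) \end{pmatrix} = \begin{pmatrix} \sum_{j=1}^n a_{1j} x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} x_j \end{pmatrix} + \begin{pmatrix} \sum_{j=1}^n a_{1j} y_j \\ \vdots \\ \sum_{j=1}^n a_{mj} y_j \end{pmatrix} = A \cdot x + A \cdot y = f_A(x) + f_A(y), \\
f_A(\lambda x) &= A \cdot (\lambda x) = \begin{pmatrix} \sum_{j=1}^n a_{1j} \lambda x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} \lambda x_j \end{pmatrix} = \lambda \begin{pmatrix} \sum_{j=1}^n a_{1j} x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} x_j \end{pmatrix} = \lambda (A \cdot x) = \lambda f_A(x).
\end{aligned}$$

That means, every matrix defines a linear map.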
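The two linearity properties can also be spot-checked numerically. The following Python sketch does this for one sample matrix and sample vectors, all assumed for illustration; it is a sanity check on a single instance, not a replacement for the general argument. The helper names `mat_vec`, `add` and `scale` are hypothetical.

```python
def mat_vec(A, x):
    """Compute A * x for a matrix A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def add(u, v):
    """Add two vectors of the same length entry by entry."""
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    """Multiply every entry of the vector v by the scalar c."""
    return [c * a for a in v]


# Sample data, assumed only for illustration.
A = [[1, 2],
     [3, 4],
     [5, 6]]
x, y, c = [1, -1], [2, 0], 7

additive = mat_vec(A, add(x, y)) == add(mat_vec(A, x), mat_vec(A, y))
homogeneous = mat_vec(A, scale(c, x)) == scale(c, mat_vec(A, x))
print(additive, homogeneous)  # True True
```

We use integer entries here so that both sides can be compared exactly.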

Definition (Induced linear map)

Let $A \in K^{m \times n}$ be a matrix over the field $K$. Then the linear map

$$f_A\colon K^n \to K^m, \quad x \mapsto A \cdot x$$

is called the linear map induced by the matrix $A$.

Thus, we now know that for each linear map there is an associated matrix, and for each matrix there is an associated linear map. For a linear map $f\colon K^n \to K^m$, we call the associated matrix $M(f)$, and for a matrix $A$ we write $f_A$ for the induced linear map. Our construction of the induced map is built exactly such that $f_{M(f)} = f$. This is quite intuitive: the linear map induced by the matrix associated with a linear map $f$ is just the map $f$ itself. We can now ask the "reverse question": If we consider the matrix associated with the linear map induced by some original matrix, do we get the original matrix back? In mathematical terms: Is $M(f_A) = A$? The following theorem answers this question in the affirmative:

Theorem

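The mapping $f \mapsto M(f)$, which sends a linear map $f\colon K^n \to K^m$ to its associated matrix, and the mapping $A \mapsto f_A$, which sends a matrix $A \in K^{m \times n}$ to its induced linear map, are bijections and each other's inverse. In particular, $M(f_A) = A$.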

Proof

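To show that the two mappings are inverse to each other, it suffices to show that applying them one after the other (in either order) yields the identity. This directly implies that both mappings are bijective. So it suffices to show that $f_{M(f)} = f$ and that $M(f_A) = A$. We already know that the first equation holds. So it only remains to show the second. Let $A$ be any $(m \times n)$-matrix. Let $a_{ij}$ be the entry in the $i$-th row and $j$-th column of $A$ and let $b_{ij}$ be the corresponding entry of the matrix $M(f_A)$.

By definition of $f_A$ we have

$$f_A(e_j) = A \cdot e_j = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \cdot e_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}.$$

So the $i$-th entry of the vector $f_A(e_j)$ is equal to $a_{ij}$. That is, $(f_A(e_j))_i = a_{ij}$.

By definition of the matrix $M(f_A)$ associated with $f_A$, the $j$-th column of $M(f_A)$ is equal to the image of $e_j$ under $f_A$. Thus,

$$f_A(e_j) = \begin{pmatrix} b_{1j} \\ \vdots \\ b_{mj} \end{pmatrix}.$$

In particular, it follows for the $i$-th entry of $f_A(e_j)$ that $(f_A(e_j))_i = b_{ij}$.

Overall, we get $a_{ij} = (f_A(e_j))_i = b_{ij}$. Since $i$ and $j$ were arbitrarily chosen, all entries of the two matrices are equal and indeed $M(f_A) = A$.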

We have thus shown that matrices and linear maps are in a "one-to-one-correspondence".
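This correspondence can be tried out on a concrete matrix. The following Python sketch first builds the map induced by a matrix and then reads off the matrix associated with that map; the result is the original matrix. The helper names `induced_map` and `matrix_of` and the sample matrix are assumptions for illustration, and the check covers only this one example.

```python
def induced_map(A):
    """Return the map x -> A * x induced by the matrix A (a list of rows)."""
    return lambda x: [sum(a * b for a, b in zip(row, x)) for row in A]


def matrix_of(f, n):
    """Return the matrix (as a list of rows) whose j-th column is f(e_j)."""
    basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
    columns = [f(e) for e in basis]
    # zip(*columns) turns the list of columns into a sequence of rows.
    return [list(row) for row in zip(*columns)]


# Sample matrix, assumed only for illustration.
A = [[1, 2, 0],
     [0, 0, 3]]

f_A = induced_map(A)
print(matrix_of(f_A, 3) == A)  # True
```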