Multiplying Matrices

Multiplying matrices can be confusing, but if you’re organized and disciplined, it’s not difficult. Just make sure to keep things straight.

Let’s multiply two matrices: \[\color{red}{A = \begin{bmatrix} 5 & 1 & 3 \\ 4 & 2 & -1 \end{bmatrix}} \quad \color{blue}{B = \begin{bmatrix} 8 & 11 \\ -6 & 7 \\ 0 & 9 \end{bmatrix}}\]

First, think about \(C = AB\). Can we multiply this? We start by deciding what the size of the product matrix \(C\) will be. Since the dimensions of \(A\) are \(2 \times 3\) and the dimensions of \(B\) are \(3 \times 2\), the dimensions of \(C\) will be \((2 \times 3)(3 \times 2) = 2 \times 2\).

We can only create a product matrix if the number of columns of the first matrix is the same as the number of rows in the second matrix. The size of the product matrix will be the number of rows of the first matrix and the number of columns of the second matrix.
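As a quick illustration of this rule, here is a minimal Python sketch. The function name `product_shape` and the idea of passing each size as a `(rows, columns)` pair are my own choices for this example, not part of any library.

```python
def product_shape(first_shape, second_shape):
    """Return (rows, columns) of the product, or raise if it is undefined."""
    rows_1, cols_1 = first_shape
    rows_2, cols_2 = second_shape
    # The product exists only when the inner dimensions match.
    if cols_1 != rows_2:
        raise ValueError(
            f"cannot multiply {rows_1}x{cols_1} by {rows_2}x{cols_2}: "
            f"{cols_1} columns vs. {rows_2} rows"
        )
    # Rows come from the first matrix, columns from the second.
    return (rows_1, cols_2)


print(product_shape((2, 3), (3, 2)))  # (2, 2)
print(product_shape((3, 2), (2, 3)))  # (3, 3)
```

Trying `product_shape((2, 3), (2, 3))` raises an error, because three columns cannot be matched against two rows.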

We create a blank \(2 \times 2\) matrix, with enough room to fill in the values: \[\color{green}{C = \begin{bmatrix} \_\_\_\_ & \_\_\_\_ \\ \_\_\_\_ & \_\_\_\_ \end{bmatrix}}\]

We look at the first matrix (\(A\)) in terms of rows. We will put the values from the first row of \(A\) in the first row of \(C\), and the same for the second row. Since there are three elements in each row of \(A\), we will create three terms in each element of \(C\). That is: \[\color{green}{C = \begin{bmatrix} \color{red}{5}\cdot\_+\color{red}{1}\cdot\_+\color{red}{3}\cdot\_ & \color{red}{5}\cdot\_+\color{red}{1}\cdot\_+\color{red}{3}\cdot\_ \\ \color{red}{4}\cdot\_+\color{red}{2}\cdot\_+\color{red}{-1}\cdot\_ & \color{red}{4}\cdot\_+\color{red}{2}\cdot\_+\color{red}{-1}\cdot\_ \end{bmatrix}}\]

Make sure to leave the blanks! At this point, we’re focusing only on the first matrix, but we want to make sure to leave space for the numbers from the second matrix.

Now we look at the second matrix (\(B\)) in terms of columns. We will put the values from the first column of \(B\) in the first column of \(C\), and so on for the other column. This will give us: \[\color{green}{C = \begin{bmatrix} \color{red}{5}\cdot\color{blue}{8}+\color{red}{1}\cdot\color{blue}{-6}+\color{red}{3}\cdot\color{blue}{0} & \color{red}{5}\cdot\color{blue}{11}+\color{red}{1}\cdot\color{blue}{7}+\color{red}{3}\cdot\color{blue}{9} \\ \color{red}{4}\cdot\color{blue}{8}+\color{red}{2}\cdot\color{blue}{-6}+\color{red}{-1}\cdot\color{blue}{0} & \color{red}{4}\cdot\color{blue}{11}+\color{red}{2}\cdot\color{blue}{7}+\color{red}{-1}\cdot\color{blue}{9} \end{bmatrix}}\]

To reiterate: We fill in the values from each ROW of \(A\) in every element of the matching ROW of \(C\), and the values from each COLUMN of \(B\) in every element of the matching COLUMN of \(C\). At this point, let’s get rid of the color and see what we have: \[C = \begin{bmatrix} 5\cdot8+1\cdot-6+3\cdot0 & 5\cdot11+1\cdot7+3\cdot9 \\ 4\cdot8+2\cdot-6-1\cdot0 & 4\cdot11+2\cdot7-1\cdot9 \end{bmatrix}\]

Finally, we evaluate each element of the matrix for our solution: \[C = \begin{bmatrix} 34 & 89 \\ 20 & 49 \end{bmatrix}\]
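If you want to check the arithmetic by machine, the same row-by-column procedure can be sketched in Python. This is only one way to write it, with matrices stored as lists of rows; the function name `multiply` is an assumption made for this example.

```python
def multiply(first, second):
    """Multiply two matrices stored as lists of rows."""
    inner = len(second)  # number of rows in the second matrix
    if any(len(row) != inner for row in first):
        raise ValueError("columns of first must equal rows of second")
    n_cols = len(second[0])
    product = []
    for row in first:                # each row of the first matrix...
        new_row = []
        for j in range(n_cols):      # ...meets each column of the second
            total = 0
            for k in range(inner):   # one term per element in the row
                total += row[k] * second[k][j]
            new_row.append(total)
        product.append(new_row)
    return product


A = [[5, 1, 3], [4, 2, -1]]
B = [[8, 11], [-6, 7], [0, 9]]
print(multiply(A, B))  # [[34, 89], [20, 49]]
```

The innermost loop builds exactly the three-term sums we wrote out by hand for each element of \(C\).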

Let’s try it the other way: \(D = BA\). What will be the size of \(D\)? Since the dimensions of \(B\) are \(3 \times 2\) and the dimensions of \(A\) are \(2 \times 3\), the dimensions of \(D\) will be \((3 \times 2)(2 \times 3) = 3 \times 3\).

We create a blank \(3 \times 3\) matrix: \[\color{purple}{D = \begin{bmatrix} \_\_\_\_ & \_\_\_\_ & \_\_\_\_ \\ \_\_\_\_ & \_\_\_\_ & \_\_\_\_ \\ \_\_\_\_ & \_\_\_\_ & \_\_\_\_ \end{bmatrix}}\]

This time, we look at \(B\) in terms of rows. We put the values from the first row of \(B\) in the first row of \(D\), and so on for the other two rows, just as before. Since there are two elements in each row of \(B\), we will create two terms in each element of \(D\). That is: \[\color{purple}{D = \begin{bmatrix} \color{blue}{8}\cdot\_ + \color{blue}{11}\cdot\_ & \color{blue}{8}\cdot\_ + \color{blue}{11}\cdot\_ & \color{blue}{8}\cdot\_ + \color{blue}{11}\cdot\_ \\ \color{blue}{-6}\cdot\_ + \color{blue}{7}\cdot\_ & \color{blue}{-6}\cdot\_ + \color{blue}{7}\cdot\_ & \color{blue}{-6}\cdot\_ + \color{blue}{7}\cdot\_ \\ \color{blue}{0}\cdot\_ + \color{blue}{9}\cdot\_ & \color{blue}{0}\cdot\_ + \color{blue}{9}\cdot\_ & \color{blue}{0}\cdot\_ + \color{blue}{9}\cdot\_ \end{bmatrix}}\]

Now, we look at \(A\) in terms of columns, placing the values of each column in the blanks in \(D\): \[\color{purple}{D = \begin{bmatrix} \color{blue}{8}\cdot\color{red}{5} + \color{blue}{11}\cdot\color{red}{4} & \color{blue}{8}\cdot\color{red}{1} + \color{blue}{11}\cdot\color{red}{2} & \color{blue}{8}\cdot\color{red}{3} + \color{blue}{11}\cdot\color{red}{-1} \\ \color{blue}{-6}\cdot\color{red}{5} + \color{blue}{7}\cdot\color{red}{4} & \color{blue}{-6}\cdot\color{red}{1} + \color{blue}{7}\cdot\color{red}{2} & \color{blue}{-6}\cdot\color{red}{3} + \color{blue}{7}\cdot\color{red}{-1} \\ \color{blue}{0}\cdot\color{red}{5} + \color{blue}{9}\cdot\color{red}{4} & \color{blue}{0}\cdot\color{red}{1} + \color{blue}{9}\cdot\color{red}{2} & \color{blue}{0}\cdot\color{red}{3} + \color{blue}{9}\cdot\color{red}{-1} \end{bmatrix}}\]

Without the color, this is: \[D = \begin{bmatrix} 8\cdot 5 + 11\cdot 4 & 8\cdot 1 + 11 \cdot 2 & 8 \cdot 3 + 11 \cdot -1 \\ -6 \cdot 5 + 7 \cdot 4 & -6 \cdot 1 + 7 \cdot 2 &  -6 \cdot 3 +  7 \cdot -1 \\ 0 \cdot 5 + 9 \cdot 4 & 0 \cdot 1 + 9 \cdot 2 & 0 \cdot 3 + 9 \cdot -1 \end{bmatrix}\]

Evaluating each element gives us: \[D = \begin{bmatrix} 84 & 30 & 13 \\ -2 & 8 & -25 \\ 36 & 18 & -9 \end{bmatrix}\]
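Comparing the two products side by side shows that the order of multiplication matters: \(AB\) and \(BA\) here do not even have the same size. The sketch below is a compact Python version of "rows of the first matrix against columns of the second"; the name `rows_times_columns` is made up for this example, and the function does no size checking, so it assumes the matrices are compatible.

```python
A = [[5, 1, 3], [4, 2, -1]]
B = [[8, 11], [-6, 7], [0, 9]]


def rows_times_columns(first, second):
    """Each entry is a row of `first` combined with a column of `second`.

    No size check is done here: the inputs are assumed to be compatible.
    """
    columns = list(zip(*second))  # regroup the second matrix by column
    return [
        [sum(r * c for r, c in zip(row, col)) for col in columns]
        for row in first
    ]


C = rows_times_columns(A, B)
D = rows_times_columns(B, A)
print(C)  # [[34, 89], [20, 49]]
print(D)  # [[84, 30, 13], [-2, 8, -25], [36, 18, -9]]
```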

If you would like to check your work, here is an online matrix multiplication calculator. Make up some small matrices and practice multiplying them, then check your answer! (Remember: These tools are provided so you can check your work, not to help you cheat.)

Thinking like a Mathematician: The Beauty of Stuckness

The Math with Bad Drawings blog is excellent overall, but I encourage my students to read this article in particular.

Andrew Wiles proved Fermat’s Last Theorem. The theorem states that if \(a, b, c, d\) are all positive integers and \(a^d + b^d = c^d\), then \(d \le 2\). That is, there are solutions for \(a^1 + b^1 = c^1\) and \(a^2 + b^2 = c^2\), but no solutions for any higher power.
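To get a feel for the claim, we can search small numbers by brute force. The Python sketch below is only a spot check over a finite range, so it proves nothing about the theorem itself; the function name `solutions` and the search limit of 20 are arbitrary choices for this example.

```python
def solutions(power, limit):
    """List (a, b, c) with a <= b, all at most `limit`,
    such that a**power + b**power == c**power."""
    # Map each perfect power back to its base for quick lookup.
    bases = {c ** power: c for c in range(1, limit + 1)}
    found = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            c = bases.get(a ** power + b ** power)
            if c is not None:
                found.append((a, b, c))
    return found


for d in range(1, 5):
    hits = solutions(d, 20)
    print(d, len(hits), hits[:3])
```

Within this range, powers 1 and 2 turn up solutions such as \(3^2 + 4^2 = 5^2\), while powers 3 and 4 turn up none.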

This theorem went unproven for more than three centuries. Mathematicians suspected it was true, and Fermat (who created the conjecture) claimed to have a proof. But Wiles’s proof ran to well over a hundred pages and relied on twentieth-century mathematics, so it’s now considered very unlikely that Fermat had a valid proof.

Wiles’s point in this article is that what makes a mathematician different is that they see frustrations and walls as challenges, not as reasons to give up. I hope all of my students can come to understand that.


Mathematics without Negatives

The word “algebra” comes from the title of a book from around AD 800 by Muhammad Al-Khwarizmi. Despite this, the symbols that we associate with modern algebra (particularly, the use of single-letter variable names) don’t appear in the book. Also, the conceptual field of mathematics called algebra came several centuries before: Al-Khwarizmi’s book is historically significant, but it built on previous work, and the modern symbolism didn’t appear until long after.

One limitation of the book is that Al-Khwarizmi didn’t use negative numbers. This was typical of mathematicians of the era: Negative numbers were in use, but were heavily resisted by many.

He begins his book by showing three geometrical solutions to what we now call a quadratic equation, \(ax^2 + bx + c = 0\). He needs three because the limitation to positive numbers means he can’t use negative coefficients. So, rather, he shows how to solve the following:

  1. \(ax^2 + bx = c\)
  2. \(ax^2 + c = bx\)
  3. \(bx + c = ax^2\)

Likewise, he can’t solve for negative roots; the only time there are two solutions is when both solutions are positive.
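Here is a small Python sketch of what "positive roots only" means for the three forms. It uses the modern quadratic formula rather than his geometric method, and the function name `positive_roots` and the sample coefficients are assumptions chosen for this example.

```python
import math


def positive_roots(form, a, b, c):
    """Positive real roots of one of the three forms, with a, b, c > 0.

    form 1: a*x^2 + b*x = c
    form 2: a*x^2 + c = b*x
    form 3: b*x + c = a*x^2
    """
    # Rewrite each form as a*x^2 + B*x + C = 0 with signed B and C.
    signed = {1: (b, -c), 2: (-b, c), 3: (-b, -c)}
    B, C = signed[form]
    disc = B * B - 4 * a * C
    if disc < 0:
        return []  # no real roots at all
    root = math.sqrt(disc)
    candidates = {(-B - root) / (2 * a), (-B + root) / (2 * a)}
    # Keep only the roots a positive-numbers-only method could find.
    return sorted(x for x in candidates if x > 0)


print(positive_roots(1, 1, 10, 39))  # x^2 + 10x = 39  ->  [3.0]
print(positive_roots(2, 1, 10, 21))  # x^2 + 21 = 10x  ->  [3.0, 7.0]
print(positive_roots(3, 1, 3, 4))    # 3x + 4 = x^2    ->  [4.0]
```

Only the second form can produce two answers, because that is the only form where both roots can be positive.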

This might seem odd to modern students, but it’s important to remember that he was providing geometric solutions. There are no negatives in geometry proper: All measurements are positive. From this standpoint, his approach makes perfect sense.

If you’d like to read more, my presentation of his first chapter is available on my other blog: First post; second post.

The sum of the first n positive integers

In this article, I’ll illustrate how we can use two different strategies to develop the same formula. This relates to the toothpick problem we explored in class.

Strategy 1

There’s a story, probably apocryphal, about the great mathematician Carl Friedrich Gauss as a young man. He was told to add the numbers 1 to 100, which he did in less than a minute. He observed that \(1 + 100 = 101\), \(2 + 99 = 101\), and so on up to \(50 + 51 = 101\). Since there were 50 pairs that each added to 101, the total sum must be \(50 \times 101 = 5050\).

We can generalize this technique to quickly add any number of positive integers. Let’s call the highest integer \(n\), and the sum \(s_n\). The sum of each pair will be \(n + 1\). If \(n\) is even, then there will be \(\frac{n}{2}\) pairs, so the sum of all values will be \[\frac{n(n+1)}{2}\]

If \(n\) is odd, the situation is a little trickier. There will be \(\frac{n-1}{2}\) pairs and a loner in the middle, \(\frac{n + 1}{2}\). For instance, for the first nine positive integers, the pairs are \(1 + 9\), \(2 + 8\), \(3 + 7\), and \(4 + 6\), and the loner is \(\frac{9+1}{2} = 5\). To find the sum, multiply the sum of each pair, \(n + 1\), by the number of pairs, then add in the loner: \[\frac{(n+1)(n-1)}{2} + \frac{n + 1}{2} = \frac{(n+1)(n - 1) + n + 1}{2}\] This looks daunting, but we can simplify it: \[\frac{(n+1)(n-1) + (n+1)(1)}{2} = \frac{(n+1)(n-1+1)}{2} \\ = \frac{(n+1)n}{2} \\ = \frac{n(n+1)}{2}\] This is the same formula we got when \(n\) is even, so we can use it for all cases.

This is the standard way of developing the formula \[s_n = \frac{n(n+1)}{2}\]
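The pairing argument, including the loner for odd \(n\), can be written out as a short Python sketch and compared against the formula. The function names here are my own, chosen for this example.

```python
def sum_by_pairing(n):
    """Add 1..n by pairing the ends, plus a middle loner when n is odd."""
    pairs = n // 2               # n/2 pairs if even, (n-1)/2 if odd
    total = pairs * (n + 1)      # every pair adds up to n + 1
    if n % 2 == 1:
        total += (n + 1) // 2    # the loner in the middle
    return total


def sum_by_formula(n):
    return n * (n + 1) // 2


for n in (9, 100):
    print(n, sum_by_pairing(n), sum_by_formula(n), sum(range(1, n + 1)))
```

For \(n = 9\) all three columns give 45, and for \(n = 100\) they give 5050.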

Strategy 2

Let’s look at a different route, one based on the pattern of the sums. Here are the first six sums:

  • \(s_1 = 1\)
  • \(s_2 = 1 + 2 = 3\)
  • \(s_3 = 1 + 2 + 3 = 6\)
  • \(s_4 = 1 + 2 + 3 + 4 = 10\)
  • \(s_5 = 1 + 2 + 3 + 4 + 5 = 15\)
  • \(s_6 = 1 + 2 + 3 + 4 + 5 + 6 = 21\)

Here are the means, \(m_n\), of the values in each sum. Recall: To find the mean, divide the sum by the number of values.

  • \(m_1 = 1\)
  • \(m_2 = 3/2 = 1.5\)
  • \(m_3 = 6/3 = 2\)
  • \(m_4 = 10/4 = 2.5\)
  • \(m_5 = 15/5 = 3\)
  • \(m_6 = 21/6 = 3.5\)

The pattern is clear: Each time \(n\) goes up by 1, the mean goes up by 0.5. If we double the means, we get {2, 3, 4, 5, 6, 7}, each of which is one more than the corresponding \(n\). From this pattern, we predict that \(m_n = \frac{n+1}{2}\).

Since the mean is equal to the sum divided by the number of values, i.e., \(m_n = \frac{s_n}{n}\), this means \(\frac{s_n}{n} = \frac{n+1}{2}\). Multiply through by \(n\) to get \[s_n = \frac{n(n+1)}{2}\] which is the same formula we developed above.
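The table of means can be reproduced with a few lines of Python. This sketch uses exact fractions so that values like 1.5 are not rounded; the function name `mean_of_first` is an assumption made for this example. Checking a handful of values only shows the pattern so far, not that it continues forever.

```python
from fractions import Fraction


def mean_of_first(n):
    """Mean of the first n positive integers, as an exact fraction."""
    return Fraction(sum(range(1, n + 1)), n)


for n in range(1, 7):
    m = mean_of_first(n)
    # Doubling the mean should give one more than n.
    print(n, m, 2 * m == n + 1)
```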

Inductive Proof (Bonus)

To be mathematically rigorous, it’s not enough to say that we found a pattern, and so that pattern must always hold. It’s possible that the mean follows that pattern for a while, and then something happens at, say, \(m_9\) or \(m_{100}\) to break it.

To account for this possibility, mathematicians developed what is called an inductive proof. Such a proof consists of two parts:

  • Show that a formula holds for some simple case, such as \(m_1\).
  • Show that if a formula holds for a certain value (\(m_{k-1}\)), it holds for the next value as well (\(m_k\)).

If the formula holds for the first case, and holding for any one case guarantees that it holds for the next, then it holds for all cases.

We already know that \(m_1 = \frac{1 + 1}{2} = 1\), since that’s part of how we got the formula in the first place. We now need to show that, if \[m_{k-1} = \frac{k - 1 + 1}{2} = \frac{k}{2}\] then \[m_k = \frac{k+1}{2}\]

Let’s say we have the mean of a set of numbers. We’re going to add a new number to the set and find the new mean. Since the mean of a set of numbers is the sum divided by the count, that is, \(m = \frac{s}{n}\), the sum of the values is the mean times the count (i.e., \(s = mn\)). The new sum, including the new value \(j\), is \(s + j\). The new mean is \(\frac{s + j}{n + 1}\).

In this case, \(s_{k-1} = (k-1) m_{k-1}\), \(n = k - 1\), and \(j = k\). So the new sum is \[s_k = (k-1) m_{k-1} + k\] and the new mean is \[m_k = \frac{s_k}{k} = \frac{(k-1) m_{k-1} + k}{k}\] We want to know the value of \(m_k\) when \(m_{k-1} = \frac{k}{2}\), so we substitute appropriately, then simplify: \[m_k =  \frac{(k-1)\frac{k}{2} + k}{k} \\ = \frac{(k-1)k + 2k}{2k} \\ = \frac{(k-1+2)k}{2k} \\ = \frac{k+1}{2}\]

This is what we needed to demonstrate, so our proof is complete: Since \(m_1 = \frac{1 + 1}{2} = 1\) and \(m_{k-1} = \frac{k}{2} \Rightarrow m_k = \frac{k+1}{2}\), we know that the pattern we established for the mean of the first \(n\) positive integers always holds.
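The inductive step is really a rule for updating the mean, and we can watch it work in a short Python sketch. The algebra is the proof; the code merely applies the update rule starting from the base case and confirms the formula at each step up to an arbitrary cutoff of 200. The function name `next_mean` is made up for this example.

```python
from fractions import Fraction


def next_mean(prev_mean, k):
    """Mean of 1..k, given the mean of 1..(k-1)."""
    # Old sum is (k-1) * prev_mean; add the new value k, divide by k.
    return ((k - 1) * prev_mean + k) / k


mean = Fraction(1)  # base case: the mean of just the number 1
for k in range(2, 201):
    mean = next_mean(mean, k)
    assert mean == Fraction(k + 1, 2)  # matches (k+1)/2 at every step

print(mean)  # 201/2
```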

Since that pattern holds up, we can also conclude that the formula we developed for the sum also holds up.