In this video, Vi Hart talks about the multiplication scale, which gives us logarithms. She speaks quickly and poetically; don’t expect to understand all of it at this point. If you just want to hear the part about logarithms (which applies to the slide rule that I was discussing today), jump to the 5:00 mark.

# Category Archives: In Depth

# The History of Imaginary Numbers

In this video, Barry Mazur discusses an early appearance of imaginary numbers in mathematical history, and how the mathematician responded to it.

I highly recommend all Numberphile videos as interesting explorations of mathematics and mathematical history.

# Simplifying Radicals on the TI-84 CE

Unfortunately, there doesn’t seem to be a built-in function on the TI-84 CE to present a simplified radical. You can write a program, which I provide here, but it takes so much work just to enter it into the calculator that I don’t advise it. However, I’m presenting it here to give you an idea of how to program the calculator, in case you’re interested.

To create a program of your own, including entering this one, press the **prgm** button, then select **NEW** and **1:Create New**. Give it a name (I called this **SIMPRAD** for “Simplify Radical”).

Then enter the code below, not including the line-initial colons. These colons represent the start of a new line.

:Input "RADICAND? ",D :1→C :"+"→Str3 :If D<0 :Then :-D→D :"i"→Str3 :End :If D>0 and fPart(D)=0 :Then :While fPart(D/4)=0 :C*2→C :D/4→D :End :For(E,3,√(D),2) :While fPart(D/E^2)=0 :C*E→C :D/E^2→D :End :End :"? :For(A,1,1+log(C)) :sub("0123456789",ipart(10*fPart(C*10^(-A)))+1,1)+Ans :End :sub(Ans,1,length(Ans)-1→Str1 :"? :For(A,1,1+log(D)) :sub("0123456789",ipart(10*fPart(D*10^(-A)))+1,1)+Ans :End :sub(Ans,1,length(Ans)-1→Str2 :If Str3="i" :Str1+Str3→Str1 :If D>1 :Str1+"√("+Str2+")"→Str1 :Disp Str1 :Else :If D=0 :Then :Disp "0" :Else :Disp "INVALID" :End :End

I won’t go through all the entry details; some of the characters can be entered from keys on the calculator, while others require going to specific menus. If you really do want to enter this into your calculator, search around or ask me for specific items.

Let’s look at how each section of this code works.

:Input "RADICAND? ",D :1→C :"+"→Str3

The lines above ask the user for the number to be simplified. For instance, if you want to simplify \(\sqrt{412}\), you would enter **412**. When the program is done, **D** will hold the radicand and **C** will hold the coefficient. **Str3** will let us know if the initial radicand is negative.

:If D<0 :Then :-D→D :"i"→Str3 :End

The lines above allow for imaginary roots.

:If D>0 and fPart(D)=0 :Then

We will only process positive integers this way; 0 and non-integers will be handled separately.

:While fPart(D/4)=0 :C*2→C :D/4→D :End

There are two approaches we could use: Have a list of primes that we walk through, or test 2 and then all odd integers greater than 1. For ease of programming, I’ll do the latter. So these lines divide the radicand by \(2^2 = 4\) until doing so results in a non-integer.

:For(E,3,√(D),2) :While fPart(D/E^2)=0 :C*E→C :D/E^2→D :End :End

These lines divide the radicand by \(3^2 = 9\), \(5^2 = 25\), and so on up to the square root of the radicand, moving on to each new odd number when dividing results in a non-integer.
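For readers who would rather experiment on a computer, here is a minimal Python sketch of the same idea: pull out factors of 4, then the squares of the odd numbers. It is not a translation of the calculator program; the function name `simplify_radical` is my own, and it rechecks the stopping point on every pass instead of computing the square root once.

```python
def simplify_radical(radicand):
    """Return (coefficient, remaining) so that sqrt(radicand) = coefficient * sqrt(remaining)."""
    if radicand <= 0:
        raise ValueError("this sketch only handles positive integers")
    coefficient = 1
    # Pull out factors of 2^2 = 4 first.
    while radicand % 4 == 0:
        coefficient *= 2
        radicand //= 4
    # Then try 3, 5, 7, ... for as long as the square still fits in the radicand.
    divisor = 3
    while divisor * divisor <= radicand:
        while radicand % (divisor * divisor) == 0:
            coefficient *= divisor
            radicand //= divisor * divisor
        divisor += 2
    return coefficient, radicand


print(simplify_radical(412))  # (2, 103)
print(simplify_radical(48))   # (4, 3)
```

The first result says that \(\sqrt{412} = 2\sqrt{103}\), and the second that \(\sqrt{48} = 4\sqrt{3}\).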

At this point, we have what we need. If we were willing to have ugly output, we could pretty much stop here. Most of the rest of the code is to make the output attractive. We have to do this by hand because, until recently, the TI-84 CE couldn’t easily convert a number into a string (a sequence of characters on the screen rather than a numeric value), and couldn’t join a number to a string. A recent OS update changed this, but I’m providing code that works for all the calculators we have in the room.

:"? :For(A,1,1+log(C)) :sub("0123456789",ipart(10*fPart(C*10^(-A)))+1,1)+Ans :End :sub(Ans,1,length(Ans)-1→Str1

The lines above convert the coefficient from the number **C** into the string **Str1**.

:"? :For(A,1,1+log(D)) :sub("0123456789",ipart(10*fPart(D*10^(-A)))+1,1)+Ans :End :sub(Ans,1,length(Ans)-1→Str2

The lines above convert the radicand into **Str2**.

:If Str3="i" :Str1+Str3→Str1 :If D>1 :Str1+"√("+Str2+")"→Str1 :Disp Str1

The lines above create a string like **4i√(3)**. The rest of the code handles 0 (in which case, just display 0) and non-integers (in which case, display “INVALID”).

:Else :If D=0 :Then :Disp "0" :Else :Disp "INVALID" :End :End

I think this gives an interesting overview of programming the TI-84. If you have a personal calculator and want to store this, feel free.

# Deriving the Quadratic Formula

So where does the quadratic formula come from, anyway?

The formula gives us the solutions of a quadratic equation of the form \[ax^2 + bx + c = 0\] It tells us that this equation is true when \[x = \frac{-b\pm\sqrt{b^2 - 4ac}}{2a}\]

But where did such a strange formula come from?

It comes from solving the quadratic equation for \(x\), but that requires some substitution. We can’t directly solve an equation that contains the same variable to different powers. Technically, \(x^2\) is a different variable than \(x\).

Instead, we need to rewrite the quadratic equation into a form that only has \(x\) one time. Consider the expression \((dx + e)^2\): This satisfies that condition. We can solve \((dx + e)^2 - f = 0\) in terms of \(x\): \[(dx + e)^2 - f = 0 \Rightarrow \\ (dx + e)^2 = f \Rightarrow \\ dx + e = \pm \sqrt{f} \Rightarrow \\ dx = -e \pm \sqrt{f} \Rightarrow \\ x = \frac{-e \pm \sqrt{f}}{d} = \frac{-e}{d} \pm \frac{\sqrt{f}}{d} \]

This looks similar to the quadratic formula, but simpler. If we could find a way to rewrite \(d\), \(e\), and \(f\) in terms of \(a\), \(b\), and \(c\), we’d be all set.

And we can do that. Let’s assume that the two forms of the quadratic equation represent an identical function, that is, \[\color{red}{a}x^2 + \color{blue}{b}x + \color{green}{c} = (dx + e)^2 - f\] If we expand the right hand side, we get \[(dx + e)^2 - f = \color{red}{d^2}x^2 + \color{blue}{2de}x + \color{green}{e^2 - f}\] This means: \[\color{red}{a = d^2} \\ \color{blue}{b = 2de} \\ \color{green}{c = e^2 - f}\]

The first line is straightforward: \(d = \sqrt{a}\) (we’ll assume that d is positive; we could still derive the formula without this assumption, but it’s more confusing).

The second line then becomes \(b = 2e\sqrt{a}\), so \(e = \frac{b}{2\sqrt{a}}\).

The third line then becomes \(c = \frac{b^2}{4a} - f\), so \(f = \frac{b^2}{4a} - c = \frac{b^2 - 4ac}{4a} \).

At this point, we can substitute each variable. Start with \(\color{red}{f}\): \[ \frac{-e}{d} \pm \frac{\sqrt{\color{red}{f}}}{d} = \frac{-e}{d} \pm \frac{\sqrt{\color{red}{\frac{b^2 - 4ac}{4a}}}}{d} \\ = \frac{-e}{d} \pm \frac{\sqrt{\color{red}{b^2 - 4ac}}}{\color{red}{2}d\color{red}{\sqrt{a}}} \]

Next, replace \(\color{red}{e}\): \[\frac{-\color{red}{e}}{d} \pm \frac{\sqrt{b^2 - 4ac}}{2d\sqrt{a}} = \frac{-\color{red}{\frac{b}{2\sqrt{a}}}}{d} \pm \frac{\sqrt{b^2 - 4ac}}{2d\sqrt{a}} \\ = \frac{-\color{red}{b}}{\color{red}{2}d\color{red}{\sqrt{a}}} \pm \frac{\sqrt{b^2 - 4ac}}{2d\sqrt{a}} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2d\sqrt{a}}\]

Finally, replace \(\color{red}{d}\): \[\frac{-b \pm \sqrt{b^2 - 4ac}}{2\color{red}{d}\sqrt{a}} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2\color{red}{\sqrt{a}}\sqrt{a}} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\]

Recall that we started with an equation involving \(d\), \(e\), and \(f\) which represented the values of \(x\) that made the equation true. That is, \[x = \frac{-e \pm \sqrt{f}}{d} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\] which is the quadratic formula in its typical form.
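If you want to convince yourself numerically, here is a small Python sketch that computes the roots both ways: once through \(d\), \(e\), and \(f\), and once through the usual formula. The function names are my own, and the sketch assumes \(a > 0\) and \(b^2 - 4ac \ge 0\) so that every square root is a real number.

```python
import math


def roots_by_completing_square(a, b, c):
    # Assumes a > 0 and b^2 - 4ac >= 0 so that all square roots are real.
    d = math.sqrt(a)      # from a = d^2
    e = b / (2 * d)       # from b = 2de
    f = e * e - c         # from c = e^2 - f
    return ((-e + math.sqrt(f)) / d, (-e - math.sqrt(f)) / d)


def roots_by_formula(a, b, c):
    root_of_discriminant = math.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))


print(roots_by_completing_square(2, -7, 3))
print(roots_by_formula(2, -7, 3))
```

Since \(2x^2 - 7x + 3 = (2x - 1)(x - 3)\), both lines should show roots of 3 and 0.5, possibly with tiny floating-point differences in the first line.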

# Multiplying complex numbers

Today I discussed what it looks like when we multiply complex numbers on the plane. In this entry, I’m going to give some more examples of how that works.

We already know how to multiply complex numbers algebraically. For example, what is the product of \(z_1 = (4 - 2i)\) and \(z_2 = (3 + i)\)? First we use the distributive law: \(z_3 = (4 - 2i)(3 + i) = 12 + 4i - 6i - 2i^2\). We replace \(i^2\) with \(-1\) then simplify this to \(z_3 = 14 - 2i\).

Let’s look at these three numbers on the complex number plane.

On the standard complex plane, there doesn’t seem to be much relationship between the two multiplicands and their product. The product is farther away, which we would expect, but it’s not clear why it’s located where it is.

First, let’s look at the absolute values of each point.

- \(|z_1| = |4-2i| = \sqrt{16 + 4} = \sqrt{20} = 2\sqrt{5}\)
- \(|z_2| = |3+i| = \sqrt{9 + 1} = \sqrt{10}\)
- \(|z_3| = |14-2i| = \sqrt{196+4} = \sqrt{200} = 10\sqrt{2} \)

The product of the absolute values of the multiplicands is the absolute value of the product. This is the first property of the product of complex numbers.

This explains why \(z_3\) has the distance that it does, but not why it’s located where it is. For this, we need to look at how much each point is rotated from the positive real axis (the x-ray).

We can create three right triangles by connecting each point to the x-ray and to the point \(0\). For instance, looking at the triangle formed by \(z_2\), it has legs of \(3\) and \(1\) and a hypotenuse of \(\sqrt{10}\). This means the angle at the lower left corner is \(\tan^{-1}{1/3} \approx 18.43^o\).

Using the inverse tangents, we can calculate the angle of rotation for each point. Here’s an important detail: If we’re rotating counterclockwise, we’ll call it a positive angle; if we’re rotating clockwise (as with \(z_1\) and \(z_3\)), we’ll call it a negative angle.

Here are the respective angles of rotation, approximately:

- \(z_1: -26.57^o\)
- \(z_2: 18.43^o\)
- \(z_3: -8.13^o\)

What do we get if we add the angles of rotation for \(z_1\) and \(z_2\)? It’s pretty close to what we got for \(z_3\); the difference is because we rounded the values.

So here’s the second property of the product of complex numbers: The angle of rotation for the product will be the sum of the angles of rotation of the multiplicands.
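Python has complex numbers built in, so both properties are easy to check for this example. The sketch below is only a numerical check, not a proof; the helper name `angle_deg` is my own. Note that `cmath.phase` always reports an angle between \(-180\) and \(180\) degrees, so for other examples the sum of the angles may differ from the reported angle by a full turn.

```python
import cmath
import math


def angle_deg(z):
    """Angle of rotation from the positive real axis, in degrees."""
    return math.degrees(cmath.phase(z))


z1 = complex(4, -2)
z2 = complex(3, 1)
z3 = z1 * z2

for name, z in (("z1", z1), ("z2", z2), ("z3", z3)):
    print(name, z, round(abs(z), 4), round(angle_deg(z), 2))

# First property: absolute values multiply.
print(math.isclose(abs(z1) * abs(z2), abs(z3)))                  # True
# Second property: angles of rotation add.
print(math.isclose(angle_deg(z1) + angle_deg(z2), angle_deg(z3)))  # True
```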

Let’s do another example. Take \(z_1 = (-1 + i)\) and \(z_2 = (1 - i)\). Then \(z_3 = -1 + i + i - i^2 = 2i\). Here are the respective absolute values:

- \(|z_1| = \sqrt{2}\)
- \(|z_2| = \sqrt{2}\)
- \(|z_3| = 2\)

and the angles of rotation:

- \(z_1: 135^o\)
- \(z_2: -45^o\)
- \(z_3: 90^o\)

… which follows the pattern we’ve established: The absolute values are multiplied, the angles of rotation are added.

### Key consequences

In general, this is a fun observation that might help you understand what multiplication of complex numbers means. But it’s also a powerful one, because it explains two key mathematical truths.

The first and more important of these is that *the product of two negative real numbers is positive*. Because all **negative** real numbers are on the negative portion of the real number line, they have an angle of rotation of \(180^o\) from the x-ray. When we multiply two negative real numbers, the angle of rotation of the product will be \(360^o\), that is, the product will be on the x-ray.

The second is that *the product of a non-zero complex number and its conjugate will always be a positive real number*. The conjugate of a complex number is formed by keeping the real part the same and taking the opposite of the imaginary part.

Let’s look at \(z_1 = 1+2i\) and its conjugate \(z_2 = 1-2i\).

If we create our triangles, we see two congruent triangles (two right triangles with congruent legs). Hence the angle of rotation for \(z_2\) is the clockwise equivalent of \(z_1\)’s, and the angle of rotation for their product will be \(0^o\). A non-zero complex number with an angle of rotation of \(0^o\) is a positive real number.

### Advanced section! (Here be dragons)

Incidentally, there is a different way of giving complex numbers. Rather than giving a real part and an imaginary part, we could instead state the absolute value and the angle of rotation. These are called polar coordinates. Our class calculator (the TI-84) even has a setting that lets us work with them.

If you set the calculator to \(a + bi\), you will be working with complex numbers in the way we’ve discussed in class. This is the most common way to work with complex numbers. However, if you set the calculator to \(re^{\theta i}\), then enter \(\sqrt{-1}\), you’ll get \(1e^{90i}\).

The number before \(e\) is the absolute value of the complex number; “r” stands for “radius”, because in this system you’re working with the radius of the circle that the point is on, and the angle of rotation of the number.

You probably haven’t met \(e\) before. This is a constant called Euler’s number that we’ll discuss later in this course.

The number between \(e\) and \(i\) is the angle of rotation. Depending on your settings, it can be given in either degrees or radians.

Please don’t set your calculator to the \(re^{\theta i}\) setting. It will confuse other users greatly.

Now, why does this work? Recall that when we take the product of complex numbers, the absolute value of the product is the product of the multiplicands’ absolute values, while its angle is the sum of their angles. Let’s look at \(re^{\theta i}\) for each of our complex number multiplicands.

- \(z_1 = r_1e^{\theta_1 i}\)
- \(z_2 = r_2e^{\theta_2 i}\)
- \(z_3 = z_1\cdot z_2 = r_1e^{\theta_1 i} r_2e^{\theta_2 i}\)

What is the rule for multiplying when we have the same number (\(e\)) to different powers? We **add** the powers! So \(z_3 = r_1r_2 e^{(\theta_1 + \theta_2) i}\). This is exactly what we want: Multiply the absolute values (\(r\)) and add the angles of rotation (\(\theta\)).
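Here is a short Python sketch of that rule, using the earlier example \(z_1 = -1 + i\) and \(z_2 = 1 - i\). The function name `multiply_polar` is my own; `cmath.rect` is the standard-library function that converts a radius and an angle (in radians) back into the \(a + bi\) form.

```python
import cmath
import math


def multiply_polar(r1, theta1_deg, r2, theta2_deg):
    """Multiply two complex numbers given as (radius, angle in degrees)."""
    return r1 * r2, theta1_deg + theta2_deg


# z1 = -1 + i has radius sqrt(2) and angle 135 degrees;
# z2 = 1 - i has radius sqrt(2) and angle -45 degrees.
r3, theta3 = multiply_polar(math.sqrt(2), 135, math.sqrt(2), -45)
product = cmath.rect(r3, math.radians(theta3))

print(theta3)                                          # 90
print(cmath.isclose(product, 2j))                      # True
print(cmath.isclose(product, (-1 + 1j) * (1 - 1j)))    # True
```

The radius comes out as 2 (up to floating-point rounding) and the angle as 90 degrees, which is \(2i\), the same answer we got algebraically.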

# Simplifying Square Roots

Because square roots are usually irrational, we generally don’t want to convert them into decimal form until the very last step, if at all. However, it is typical to simplify square roots by taking out any perfect square factors. To do this, we need to identify these.

One method is to create factor trees. The method in this post, however, involves creating a list of prime factors. We test the target number against each prime number in turn. The first few prime numbers are 2, 3, 5, 7, 11, and 13.

For instance, what are the prime factors of 924?

Can 924 be divided by 2? Yes, 924/2 = 462. Can 462 be divided by 2? Yes, 462/2 = 231.

231 can’t be divided by 2, so now we try 3. 231/3 = 77.

77 can’t be divided by 3, so we try 5. That doesn’t work either, so we try 7: 77/7 = 11, which we also know is a prime.

This gives us our list of factors: 924 = 2 * 2 * 3 * 7 * 11. Of these, only 2 * 2 represents a perfect square, so \(\sqrt{924} = 2\sqrt{231}\).
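The walk-through above can be sketched in a few lines of Python. This is only an illustration, with a function name of my own choosing. For simplicity it tries every integer starting from 2, not just the primes; that gives the same result, because by the time we reach a composite number such as 4 or 6, its prime factors have already been divided out.

```python
def prime_factors(n):
    """Return the prime factors of n (with repeats), smallest first."""
    factors = []
    divisor = 2
    while n > 1:
        # Keep dividing by the same number for as long as it works.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    return factors


print(prime_factors(924))  # [2, 2, 3, 7, 11]
```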

When we’re testing prime numbers, it’s important to know when we can stop. Let’s say we want to know if 113 is prime. Do we need to test all numbers smaller than 113? That’s a lot of work.

It turns out we only need to test all the prime numbers up to \(\sqrt{113}\). Why is this?

To see why, look at 115. The prime factors of 115 are 5 and 23. While there is a prime factor that is greater than \(\sqrt{115}\), there is also a prime factor less than it.

In general, if a number \(n\) is composite, it has at least two factors, \(a\) and \(b\). Since \(n = ab\), then \(a = n/b\). Let \(m = \sqrt{n}\), so \(n = m\cdot m\) and \(m = n/m\). If \(a > m\), then \(n/b > n/m\).

Solve this for \(b\): \(n > nb/m \Rightarrow nm > nb \Rightarrow m > b\).

In other words, if there is a factor that is greater than \(\sqrt{n}\), there is another factor that is less than \(\sqrt{n}\). So we only need to try the prime numbers up to (and including) the square root of a number to see if it’s prime.

For 113 specifically: Since 113 is less than 121, we only need to test prime numbers less than 11, that is, 2, 3, 5, and 7. None of these are factors of 113, so we can conclude that 113 is prime (which it is).
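As a rough Python sketch of this stopping rule (the function name `is_prime` is my own): we stop as soon as the square of the number we are testing exceeds \(n\), which is the same as stopping at \(\sqrt{n}\). Like the factoring sketch, it tries every integer and not only primes, which costs a little extra work but does not change the answer.

```python
def is_prime(n):
    """Trial division, stopping once divisor * divisor exceeds n."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


print(is_prime(113))  # True
print(is_prime(115))  # False (5 * 23)
print(is_prime(121))  # False (11 * 11)
```

The last example shows why the square root itself has to be included in the test.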

Since 17 * 17 = 289, the list of primes provided above (2, 3, 5, 7, 11, 13) is enough to test any number less than 289 for factors. Adding 17 and 19 to the list lets us test any number less than 529.

# Multiplying Matrices

Multiplying matrices can be confusing, but if you’re organized and disciplined, it’s not difficult. Just make sure to keep things straight.

Let’s multiply two matrices: \[\color{red}{A = \begin{bmatrix} 5 & 1 & 3 \\ 4 & 2 & -1 \end{bmatrix}} \quad \color{blue}{B = \begin{bmatrix} 8 & 11 \\ -6 & 7 \\ 0 & 9 \end{bmatrix}}\]

First, think about \(C = AB\). Can we multiply this? We start by deciding what the size of the product matrix \(C\) will be. Since the dimensions of \(A\) are \(2 \times 3\) and the dimensions of \(B\) are \(3 \times 2\), the dimensions of \(C\) will be \((2 \times 3)(3 \times 2) = 2 \times 2\).

We can only create a product matrix if the number of columns of the first matrix is the same as the number of rows in the second matrix. The size of the product matrix will be the number of rows of the first matrix and the number of columns of the second matrix.

We create a blank 2 x 2 matrix, with enough room to fill in the values: \[\color{green}{C = \begin{bmatrix} \_\_\_\_ & \_\_\_\_ \\ \_\_\_\_ & \_\_\_\_ \end{bmatrix}}\]

We look at the first matrix (\(A\)) in terms of rows. We will put the values from the first row of \(A\) in the first row of \(C\), and the same for the second row. Since there are three elements in each row of \(A\), we will create three terms in each element of \(C\). That is: \[\color{green}{C = \begin{bmatrix} \color{red}{5}\cdot\_+\color{red}{1}\cdot\_+\color{red}{3}\cdot\_ & \color{red}{5}\cdot\_+\color{red}{1}\cdot\_+\color{red}{3}\cdot\_ \\ \color{red}{4}\cdot\_+\color{red}{2}\cdot\_+\color{red}{-1}\cdot\_ & \color{red}{4}\cdot\_+\color{red}{2}\cdot\_+\color{red}{-1}\cdot\_ \end{bmatrix}}\]

Make sure to leave the blanks! At this point, we’re focusing only on the first matrix, but we want to make sure to leave space for the numbers from the second matrix.

Now we look at the second matrix (\(B\)) in terms of columns. We will put the values from the first column of \(B\) in the first column of \(C\), and so on for the other column. This will give us: \[\color{green}{C = \begin{bmatrix} \color{red}{5}\cdot\color{blue}{8}+\color{red}{1}\cdot\color{blue}{-6}+\color{red}{3}\cdot\color{blue}{0} & \color{red}{5}\cdot\color{blue}{11}+\color{red}{1}\cdot\color{blue}{7}+\color{red}{3}\cdot\color{blue}{9} \\ \color{red}{4}\cdot\color{blue}{8}+\color{red}{2}\cdot\color{blue}{-6}+\color{red}{-1}\cdot\color{blue}{0} & \color{red}{4}\cdot\color{blue}{11}+\color{red}{2}\cdot\color{blue}{7}+\color{red}{-1}\cdot\color{blue}{9} \end{bmatrix}}\]

To reiterate: We fill in the values from each ROW of \(A\) in every element of the matching ROW of \(C\), and the values from each COLUMN of \(B\) in every element of the matching COLUMN of \(C\). At this point, let’s get rid of the color and see what we have: \[C = \begin{bmatrix} 5\cdot8+1\cdot-6+3\cdot0 & 5\cdot11+1\cdot7+3\cdot9 \\ 4\cdot8+2\cdot-6-1\cdot0 & 4\cdot11+2\cdot7-1\cdot9 \end{bmatrix}\]

Finally, we evaluate each element of the matrix for our solution: \[C = \begin{bmatrix} 34 & 89 \\ 20 & 49 \end{bmatrix}\]

Let’s try it the other way: \(D = BA\). What will be the size of \(D\)? Since the dimensions of \(B\) are \(3 \times 2\) and the dimensions of \(A\) are \(2 \times 3\), the dimensions of \(D\) will be \((3 \times 2)(2 \times 3) = 3 \times 3\).

We create a blank 3 x 3 matrix: \[\color{purple}{D = \begin{bmatrix} \_\_\_\_ & \_\_\_\_ & \_\_\_\_ \\ \_\_\_\_ & \_\_\_\_ & \_\_\_\_ \\ \_\_\_\_ & \_\_\_\_ & \_\_\_\_ \end{bmatrix}}\]

This time, we look at \(B\) in terms of rows. We put the values from the first row of \(B\) in the first row of \(D\), and so on for the other two rows, just as before. Since there are two elements in each row of \(B\), we will create two terms in each element of \(D\). That is: \[\color{purple}{D = \begin{bmatrix} \color{blue}{8}\cdot\_ + \color{blue}{11}\cdot\_ & \color{blue}{8}\cdot\_ + \color{blue}{11}\cdot\_ & \color{blue}{8}\cdot\_ + \color{blue}{11}\cdot\_ \\ \color{blue}{-6}\cdot\_ + \color{blue}{7}\cdot\_ & \color{blue}{-6}\cdot\_ + \color{blue}{7}\cdot\_ & \color{blue}{-6}\cdot\_ + \color{blue}{7}\cdot\_ \\ \color{blue}{0}\cdot\_ + \color{blue}{9}\cdot\_ & \color{blue}{0}\cdot\_ + \color{blue}{9}\cdot\_ & \color{blue}{0}\cdot\_ + \color{blue}{9}\cdot\_ \end{bmatrix}}\]

Now, we look at \(A\) in terms of columns, placing the values of each column in the blanks in \(D\): \[\color{purple}{D = \begin{bmatrix} \color{blue}{8}\cdot\color{red}{5} + \color{blue}{11}\cdot\color{red}{4} & \color{blue}{8}\cdot\color{red}{1} + \color{blue}{11}\cdot\color{red}{2} & \color{blue}{8}\cdot\color{red}{3} + \color{blue}{11}\cdot\color{red}{-1} \\ \color{blue}{-6}\cdot\color{red}{5} + \color{blue}{7}\cdot\color{red}{4} & \color{blue}{-6}\cdot\color{red}{1} + \color{blue}{7}\cdot\color{red}{2} & \color{blue}{-6}\cdot\color{red}{3} + \color{blue}{7}\cdot\color{red}{-1} \\ \color{blue}{0}\cdot\color{red}{5} + \color{blue}{9}\cdot\color{red}{4} & \color{blue}{0}\cdot\color{red}{1} + \color{blue}{9}\cdot\color{red}{2} & \color{blue}{0}\cdot\color{red}{3} + \color{blue}{9}\cdot\color{red}{-1} \end{bmatrix}}\]

Without the color, this is: \[D = \begin{bmatrix} 8\cdot 5 + 11\cdot 4 & 8\cdot 1 + 11 \cdot 2 & 8 \cdot 3 + 11 \cdot -1 \\ -6 \cdot 5 + 7 \cdot 4 & -6 \cdot 1 + 7 \cdot 2 & -6 \cdot 3 + 7 \cdot -1 \\ 0 \cdot 5 + 9 \cdot 4 & 0 \cdot 1 + 9 \cdot 2 & 0 \cdot 3 + 9 \cdot -1 \end{bmatrix}\]

Evaluating each element gives us: \[D = \begin{bmatrix} 84 & 30 & 13 \\ -2 & 8 & -25 \\ 36 & 18 & -9 \end{bmatrix}\]
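The row-by-column bookkeeping translates directly into code. Here is a minimal Python sketch using plain lists of rows; the function name `matmul` is my own, and a real project would more likely use a library such as NumPy. Each element of the product pairs one row of the first matrix with one column of the second.

```python
def matmul(first, second):
    """Multiply two matrices given as lists of rows."""
    inner = len(second)  # number of rows in the second matrix
    if any(len(row) != inner for row in first):
        raise ValueError("columns of the first must equal rows of the second")
    cols = len(second[0])
    return [
        [sum(row[k] * second[k][j] for k in range(inner)) for j in range(cols)]
        for row in first
    ]


A = [[5, 1, 3], [4, 2, -1]]
B = [[8, 11], [-6, 7], [0, 9]]

print(matmul(A, B))  # [[34, 89], [20, 49]]
print(matmul(B, A))  # [[84, 30, 13], [-2, 8, -25], [36, 18, -9]]
```

Both results match the matrices \(C\) and \(D\) worked out by hand above.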

If you would like to check your work, here is an online matrix multiplication calculator. Make up some small matrices and practice multiplying them, then check your answer! *(Remember: These tools are provided so you can check your work, not to help you cheat.)*

# Thinking like a Mathematician: The Beauty of Stuckness

The Math with Bad Drawings blog is excellent overall, but I encourage my students to read this article in particular.

Andrew Wiles proved Fermat’s Last Theorem. This theorem said that if \(a, b, c, d\) are all positive integers and \(a^d + b^d = c^d\), then \(d \le 2\). That is, there are solutions for \(a^1 + b^1 = c^1\) and \(a^2 + b^2 = c^2\), but no solutions for any higher power.

This theorem went unproven for a long time. Mathematicians suspected it was true, and Fermat (who created the conjecture) claimed to have a proof. But Wiles’s proof runs to more than a hundred pages and relies on mathematics developed long after Fermat’s time, so it’s now considered very unlikely that Fermat had a valid proof.

Wiles’s point in this article is that what makes a mathematician different is that they see frustrations and walls as challenges, not as reasons to give up. I hope all of my students can come to understand that.

# Mathematics without Negatives

The word “algebra” comes from the title of a book from around AD 800 by Muhammad Al-Khwarizmi. Despite this, the symbols that we associate with modern algebra (particularly, the use of single letter variable names) don’t appear in the book. Also, the conceptual field of mathematics called algebra came several centuries before: Al-Khwarizmi’s book is historically significant, but it built on previous work and the modern symbolism didn’t occur until long after.

One limitation of the book is that Al-Khwarizmi didn’t use negative numbers. This was typical of mathematicians of the era: Negative numbers were in use, but were heavily resisted by many.

He begins his book by showing three geometrical solutions to what we now call a quadratic equation, \(ax^2 + bx + c = 0\). He needs three because the limitation to positive numbers means he can’t use negative coefficients. So, rather, he shows how to solve the following:

- \(ax^2 + bx = c\)
- \(ax^2 + c = bx\)
- \(bx + c = ax^2\)

Likewise, he can’t solve for negative roots; the only time there are two solutions is when both solutions are positive.

This might seem odd to modern students, but it’s important to remember that he was providing *geometric* solutions. There are no negatives in geometry proper: All measurements are positive. From this standpoint, his approach makes perfect sense.

If you’d like to read more, my presentation of his first chapter is available on my other blog: First post; second post.

# The sum of the first n positive integers

In this article, I’ll illustrate how we can use two different strategies to develop the same formula. This relates to the toothpick problem we explored in class.

### Strategy 1

There’s a story, probably apocryphal, about the great mathematician Carl Friedrich Gauss as a young man. He was told to add the numbers 1 to 100, which he did in less than a minute. He observed that \(1 + 100 = 101\), \(2 + 99 = 101\), and so on up to \(50 + 51 = 101\). Since there were 50 equations that each added to 101, the total sum must be 5050.

We can generalize this technique to quickly add any number of positive integers. Let’s call the highest integer \(n\), and the sum \(s_n\). The sum of each pair will be \(n + 1\). If \(n\) is even, then there will be \(\frac{n}{2}\) pairs, so the sum of all values will be \[\frac{n(n+1)}{2}\]

If \(n\) is odd, the situation is a little trickier. There will be \(\frac{n-1}{2}\) pairs and a loner in the middle, of \(\frac{n + 1}{2}\). For instance, for the first nine positive integers, the pairs are \(1 + 9\), \(2 + 8\), \(3 + 7\), and \(4 + 6\), and the loner is \(\frac{9+1}{2} = 5\). To find the sum, multiply the sum of each pair, \(n + 1\), by the number of pairs, then add in the loner: \[\frac{(n+1)(n-1)}{2} + \frac{n + 1}{2} = \frac{(n+1)(n - 1) + n + 1}{2}\] This looks daunting, but we can simplify it: \[\frac{(n+1)(n-1) + (n+1)(1)}{2} = \frac{(n+1)(n-1+1)}{2} \\ = \frac{(n+1)n}{2} \\ = \frac{n(n+1)}{2}\] This is the same formula we got when \(n\) is even, so we can use it for all cases.

This is the standard way of developing the formula \[s_n = \frac{n(n+1)}{2}\]

### Strategy 2

Let’s look at a different route, one based on the pattern of the sums. Here are the first six sums:

- \(s_1 = 1\)
- \(s_2 = 1 + 2 = 3\)
- \(s_3 = 1 + 2 + 3 = 6\)
- \(s_4 = 1 + 2 + 3 + 4 = 10\)
- \(s_5 = 1 + 2 + 3 + 4 + 5 = 15\)
- \(s_6 = 1 + 2 + 3 + 4 + 5 + 6 = 21\)

Here are the means of each, \(m_n\). Recall: To find the mean, divide the sum by the number of values.

- \(m_1 = 1\)
- \(m_2 = 3/2 = 1.5\)
- \(m_3 = 6/3 = 2\)
- \(m_4 = 10/4 = 2.5\)
- \(m_5 = 15/5 = 3\)
- \(m_6 = 21/6 = 3.5\)

The pattern is obvious: When \(n\) goes up by 1, the mean goes up by 0.5. If we double the means, we get {2, 3, 4, 5, 6, 7}, which is always one more than \(n\). From this pattern, we predict that \(m_n = \frac{n+1}{2}\).

Since the mean is equal to the sum divided by the number of values, i.e., \(m_n = \frac{s_n}{n}\), this means \(\frac{s_n}{n} = \frac{n+1}{2}\). Multiply through by \(n\) to get \[s_n = \frac{n(n+1)}{2}\] which is the same formula we developed above.
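Here is a small Python sketch that rebuilds the two tables and compares them with the formula. The function name `sum_first` is my own, and `Fraction` is used so the means are exact rather than rounded decimals. Checking a handful of cases like this is evidence for the pattern, not a proof of it.

```python
from fractions import Fraction


def sum_first(n):
    """The formula s_n = n(n + 1) / 2."""
    return n * (n + 1) // 2


for n in range(1, 7):
    total = sum(range(1, n + 1))   # add 1 + 2 + ... + n the long way
    mean = Fraction(total, n)      # exact mean m_n
    print(n, total, mean, sum_first(n))

print(sum_first(100))  # 5050
```

The last line reproduces the total from the Gauss story in Strategy 1.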

### Inductive Proof (Bonus)

To be mathematically rigorous, it’s not enough to say that we found a pattern, and so that pattern must always hold. It’s possible that the mean follows that pattern for a while, and then something happens at, say, \(m_9\) or \(m_{100}\) to break it.

To account for this possibility, mathematicians developed what is called an inductive proof. Such a proof consists of two parts:

- Show that a formula holds for some simple case, such as \(m_1\).
- Show that if a formula holds for a certain value (\(m_{k-1}\)), it holds for the next value as well (\(m_k\)).

If it holds for the first case, and holding for any one case guarantees that it holds for the next case, then it holds for all cases.

We already know that \(m_1 = \frac{1 + 1}{2} = 1\), since that’s part of how we got the formula in the first place. We would need to show that, if \[m_{k-1} = \frac{k - 1 + 1}{2} = \frac{k}{2}\] then \[m_k = \frac{k+1}{2}\]

Let’s say we have the mean of a set of numbers. We’re going to add a new number to the set and find the new mean. Since the mean of a set of numbers is the sum divided by the count, that is, \(m = \frac{s}{n}\), the sum of the values is the mean times the count (i.e., \(s = mn\)). The new sum, including the new value \(j\), is \(s + j\). The new mean is \(\frac{s + j}{n + 1}\).

In this case, \(s_{k-1} = (k-1) m_{k-1}\), \(n = k - 1\), and \(j = k\). So the new sum is \[s_k = (k-1) m_{k-1} + k\] and the new mean is \[m_k = \frac{s_k}{k} = \frac{(k-1) m_{k-1} + k}{k}\] We want to know the value of \(m_k\) when \(m_{k-1} = \frac{k}{2}\), so we substitute appropriately, then simplify: \[m_k = \frac{(k-1)\frac{k}{2} + k}{k} \\ = \frac{(k-1)k + 2k}{2k} \\ = \frac{(k-1+2)k}{2k} \\ = \frac{k+1}{2}\]

This is what we needed to demonstrate, so our proof is complete: Since \(m_1 = \frac{1 + 1}{2} = 1\) and \(m_{k-1} = \frac{k}{2} \Rightarrow m_k = \frac{k+1}{2}\), we know that the pattern we established for the mean of the first \(n\) positive integers always holds up.
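The algebra in the inductive step can also be spot-checked with exact fractions. The Python sketch below (with a function name of my own) assumes \(m_{k-1} = \frac{k}{2}\), builds the new mean exactly as described above, and compares it with \(\frac{k+1}{2}\). It only tests the values of \(k\) we feed it, so it supports the proof rather than replacing it.

```python
from fractions import Fraction


def next_mean(k):
    """Compute m_k from the inductive hypothesis m_{k-1} = k / 2."""
    previous_mean = Fraction(k, 2)          # assumed value of m_{k-1}
    new_sum = (k - 1) * previous_mean + k   # s_k = (k - 1) * m_{k-1} + k
    return new_sum / k                      # m_k = s_k / k


print(next_mean(2))  # 3/2
print(next_mean(6))  # 7/2
print(all(next_mean(k) == Fraction(k + 1, 2) for k in range(2, 1000)))  # True
```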

Since that pattern holds up, we can also conclude that the formula we developed for the sum also holds up.