I.

I’m not saying that I’m Mr. Fun or anything, but I like a good time. Well, a good quiet time, but my point is that I’m not exactly a dark and brooding personality — most of the time. I like happiness; I’m certainly not anti-joy. And yet there are some topics that, whenever they come up, make me sound like the biggest, baddest grump on the planet.

For whatever reason, mathematically speaking, it’s whimsy. I hate whimsy. It drives me up the wall.

Maybe you know what I mean. Mathematical whim is when the ultimate justification for some mathematical pursuit is a version of ‘because mathematicians — we’re just some wiiiiild and cra-zay guys!’

Mathematical whim is when you invent a new number because you can. It’s when you extend something beyond the point of reason, because why not? It’s when you sort of suggest that once you enter Math Club you’re powerful and in charge and nothing, not even reason itself, can keep you from playing this meaningless, arbitrary game with yourself…

…and there I go again, turning into Oscar the Grouch, but for real this whole thing irrationally bugs me.

The first time mathematical whim really bugged me was when I came to dislike the way math teachers typically introduce imaginary numbers. The common pedagogical move is to point to previously unsolvable equations and suggest that we invent a solution. So, up until now we haven’t had a solution to $x^2 = -100$? What. if. we. just. made. one. up.

“Oh my god, math teacher, can you actually do that? What is this? I didn’t realize doing math was so cool that you could just do whatever you want whenever you want?”

Five reasons why I dislike this exercise in mathematical whimsy:

• It’s historically false. Imaginary numbers earned acceptance because they were useful: treating them as numbers, as things you could add, subtract, multiply, and divide, helped you find real solutions to polynomial equations.
• It’s pedagogically false. It gives students no appreciation for why imaginary numbers are at all useful or interesting.
• It’s sociologically false. Mathematicians don’t play the role in society that this teaching suggests that they do. Mathematicians don’t get NSF funding because they’re the red-nosed court jesters of science. Mathematicians play the role they do in society because, along with the rest of paid science, the nation thinks that math is crucial to the economy and to national defense.
• It’s psychologically false. Most people pursue things for a reason.
• It’s personally false. I like things that make sense. I don’t like putting in a lot of work to understand something that we just made up because, why not? It’s not a way of thinking about math that at all connects with who I am and what I value.

In the case of imaginary numbers, this made me upset enough that I spent a lot of time trying to put together materials that expressed a different introductory vision of what these things are. And I think that non-Euclidean geometries are similarly misunderstood and mispresented to students.

And the whole thing makes me feel grumpy, and like I’m no fun at all.

II.

This thought recently came up as I’ve been studying p-adic number theory with a colleague, because this is an area of math that it is very easy to present whimsically.

Here’s a whimsical presentation: Hey, you know how we normally find the distance between two points? You know, directly, like this:

Well, what if distance worked differently? What if there are other, alternate ways of measuring the distance between two points? Maybe, like this, so that the distance between A and B is 7:

And so that sets us out on a quest to clarify what it really means to be a measure of distance, and to search for alternate ways to satisfy those conditions.

Via wikipedia, here is what those conditions might be:
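
Written out for a distance function $d$ and any points $x$, $y$, $z$ (these symbol names are the usual textbook ones, assumed here rather than taken from the original), the standard conditions read:

$$
\begin{aligned}
&d(x, y) \ge 0 \\
&d(x, y) = 0 \iff x = y \\
&d(x, y) = d(y, x) \\
&d(x, z) \le d(x, y) + d(y, z)
\end{aligned}
$$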

In other words, distance should not be negative, your distance is only 0 to yourself, for distance the order of your points doesn’t matter, and the “direct” route can’t be longer than the “indirect” route that takes a stop at some other point along the way.

Huzzah! We can now explore alternate measures of distance. Whim, engaged.

But…wait. Both the taxicab metric and the conventional (“Euclidean”) metric are defined using the absolute value function.

The taxicab metric measures the distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ as $D_t = |x_1 - x_2| + |y_1 - y_2|$.

The conventional way of measuring distance pythagorizes the taxicab terms: $D = \sqrt{|x_1 - x_2|^2 + |y_1 - y_2|^2}$.
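
As a quick sanity check, here is a minimal Python sketch of the two formulas. The function names and the sample points are made up for illustration; the points are chosen so that the taxicab distance comes out to 7.

```python
import math

def taxicab_distance(p, q):
    """Sum of the absolute coordinate differences."""
    (x1, y1), (x2, y2) = p, q
    return abs(x1 - x2) + abs(y1 - y2)

def euclidean_distance(p, q):
    """Square root of the sum of the squared absolute differences."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt(abs(x1 - x2) ** 2 + abs(y1 - y2) ** 2)

# Hypothetical sample points A and B.
a, b = (0, 0), (3, 4)
print(taxicab_distance(a, b))    # 7
print(euclidean_distance(a, b))  # 5.0
```

Both functions are built from the same ingredient, `abs`, which is the point of the paragraph that follows.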

WHIMSY TIME: WHAT IF THE ABSOLUTE VALUE WAS ENTIRELY DIFFERENT???

So, let’s do it again: let’s figure out the core qualities of what makes something an “absolute value” and then try to find weirdo, alien-planet absolute values that fit the axioms but differ from our own normy absolute value.

Here we go again, from wikipedia, here’s what it means to be an absolute value:
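
For a function $|\cdot|$ on a field such as the rationals, and any elements $x$ and $y$, the standard axioms are as follows (the notation is the usual textbook one, assumed here rather than taken from the original):

$$
\begin{aligned}
&|x| \ge 0 \\
&|x| = 0 \iff x = 0 \\
&|xy| = |x|\,|y| \\
&|x + y| \le |x| + |y|
\end{aligned}
$$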

And now we tell our students — or, in this case, me, since I’m the student here — that this is what p-adic numbers are. They are the answer to this particular call of whim, a response to the desire to explore alternate worlds and possibilities.

P-adic distance is the distance you get when you’re using these alternate absolute values. The numbers that you create, when using these alternate ways of measuring distance, are analogous to the Real numbers (Real numbers are created with conventional aka boring absolute value), but they are awesomer: they are the p-adic numbers (the p-adic completion of the rationals).

III.

Except, what?!

On what basis can you abstract the properties of what it means to be an absolute value from the one paradigmatic case of the absolute value? Who is to say that the properties that are essential to being an absolute value aren’t just every single one of the properties of the conventional absolute value?

The whole whimsical direction doesn’t make sense.

Now what is true is that there are some VERY cool things that you can do with this wider perspective on absolute values. They really do operate like a whole family of related functions. And there is a terribly stunning theorem (Ostrowski’s) that says that, on the rational numbers, there are essentially only two kinds of nontrivial absolute values: the familiar one, and the p-adic ones.

So all of that is cool, but it still left me unhappy. Where does this idea of what it means to be an absolute value come from? Why would anyone care about this?

And what’s especially frustrating is that I’m just not there yet. I’m in the middle of learning about all these things. I went back to some of the papers introducing p-adic numbers towards the end of the 19th century, and I wasn’t able to connect the dots between their concerns with algebraic numbers and what I’ve been reading in my text. And there’s no reason to think that I will be able to understand any of it until I persist a bit further in my learning.

Which leads me to a troubling thought: what if mathematical whimsy is a useful lie? What if it’s a shot of instant-motivation that’s necessary to get students over that initial hump? What if it’s the sort of thing that makes itself useless in time, a ladder that a successful student will throw away once they’ve reached a higher point of vantage?

Nah, forget that. And forget whimsy too. Someone should be able to tell me why we’d bother creating these alternate p-adic absolute values in a way that makes sense.

1. My view is that the absolute value is quite a different number from the real numbers. The absolute value is the size of the number, and is necessarily one of the unsigned numbers.
There is a gap which is overlooked, namely the unsigned reals.


2. I definitely hear your complaints about the cloying tone of whimsy – the cheerful uselessness, the feigned zaniness. But I think it’s one approach to a pretty central process in math education, which is generalization.

We go from real to complex, from triangle trig to circle trig, from exponentiation-as-repeated-multiplication to exponentiation-as-the-operation-with-these-special-properties. Each act of generalization involves a kind of pivot, a replacement of a key definition. Numbers are no longer “things you count and measure with”; now they’re “solutions to equations.” Sine and cosine aren’t “ratios of right triangle side lengths”; now they’re “coordinates on the unit circle.” And so on. I pretty much buy into this paradigm – it seems to fit the pedagogical heuristic of “start concrete, then slowly generalize” – but the transitions can be really tough to manage.

Whimsy is one approach. Another is to ask for trust (“Just bear with me on this”). Another is to foreshadow applications (“this is how trig becomes useful in signal processing”). Another is to highlight the process of generalization itself (“we’re generalizing now – just like when you learned about negative numbers, etc”). But I’m not sure I can imagine a version of math education that avoids these awkward transitions altogether; it seems more a matter of picking the right strategy mix to navigate them.


1. I really like your way of framing this as a dilemma of generalization.

I think all of your options make sense, and in the hands of a good teacher I certainly wouldn’t mind being told to just trust and wait. And I also like “we’re generalizing” now as a replacement for “let’s go BANANAS!!!”

I think one thing missing from your list is efficiency. That’s sort of a more historically accurate take on complex numbers — imaginary arithmetic allowed us to solve more efficiently a problem that we already cared about. I don’t know if I’ve ever experienced that as a student, but I’d sure like to experience something like that for p-adic numbers.


1. Yeah, good point. I find efficiency a hard benefit to sell to students (because the effort to master a big new framework rarely feels “efficient” at first) but I suspect that’s just my lacking the right language/approach. Certainly efficiency is a big part of the story with complex numbers (and I’m sure with other generalizations, too).


2. Rebekah Bob-Waksberg says:

This seems like it connects the two seemingly-at-odds things my 6th graders constantly hear from me about mathematicians:
1. Mathematicians are lazy (in that they/we are always looking for more efficient methods)
2. Mathematicians will spend lots of time (years even!) working on one problem


3. At least part of the story with complex numbers and efficiency is that efficiency was useful in these equation-solving contests that algebraists used to hold to advertise their mathematical wares. And efficiency was valued, if I understand properly, because of mathematics’ use for making business calculations.

That creates a beautiful, uncomfortable tension in the history of math: imaginary numbers were only acceptable because of their relationship to practical, business math.

I’m behind in my reading, but I’m trying to understand where the idea of “pure” or “playful” math comes from. And I think this is all part of the story. There’s another part that I’m just starting to understand, which is how science in general recreated itself in order to hold on to WW2 funding. That’s where some of the contemporary notions of “basic” and “applied” science seem to clarify themselves, and I think that’s part of our current way of thinking about math. But this is all fuzzy to me and there is more to read and write about.


3. * I think that the efficiency offered by complex numbers in the events that caused their uptake in math historically can be seen as distinct (disjoint?) from the type of efficiency valued in business calculations. Complex numbers were explanatory. (Actually, lemme make a case for explanation as an additional way to motivate generalization [/abstraction?], to the extent that it is different from “efficiency”.)

Cardano’s cubic formula was miraculous at solving cubic equations with a single real root; however, for reasons nobody understood, it appeared to “break” (i.e. it required the “impossible” extraction of a square root of a negative) on cubics that had three real roots. Bombelli offered the idea that if you just go with the requirement to extract the square root of the negative, and you work with what you get, then Cardano’s formula still works. The flaw wasn’t in the formula itself, but in our limited vision. As Needham tells the story in Visual Complex Analysis, this is the event that began to cause mathematicians to take complex numbers seriously. In fact, Cardano had earlier played around with square roots of negatives in the same book in which he published the solution to the cubic, but it was in this completely speculative way, and it did not make waves. (I wrote about this once — search the page for “postulating a negative”.)
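
To see the “break” concretely, here is a small check using the equation usually attributed to Bombelli, $x^3 = 15x + 4$. The comment above doesn’t name a specific equation, so this choice is an assumption for illustration. Cardano’s formula for $x^3 = px + q$ asks for the square root of $(q/2)^2 - (p/3)^3$, which here is negative even though $x = 4$ is plainly a real root.

```python
# Cardano's formula for x^3 = p*x + q:
#   x = cbrt(q/2 + sqrt(D)) + cbrt(q/2 - sqrt(D)),  with D = (q/2)^2 - (p/3)^3
p, q = 15, 4                        # assumed example: x^3 = 15x + 4
D = (q // 2) ** 2 - (p // 3) ** 3   # exact here, since 2 divides q and 3 divides p
print(D)  # -121, so the formula demands sqrt(-121) = 11i

# "Going with it": we need cube roots of 2 + 11i and 2 - 11i.
# The candidates 2 + i and 2 - i can be checked by direct multiplication.
u = complex(2, 1)
v = complex(2, -1)
print(u * u * u)  # (2+11j)
print(v * v * v)  # (2-11j)

# The imaginary parts cancel and the real root appears.
x = u + v
print(x)                                # (4+0j)
print(x.real ** 3 == 15 * x.real + 4)   # True
```

The imaginary numbers show up only in the middle of the calculation; the answer at the end is an ordinary real root.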

* A common difficulty motivating generalizations and abstractions with learners is that the historical motivations are often out of reach technically. For example, most ppl who are learning about complex numbers for the first time do not know Cardano’s cubic formula, so you can’t usually use Bombelli’s calculations to make the case for $i$. Michael, I take you to be pointing to this same issue with the $p$-adics: if you try to understand what $p$-adics are for by digging around in Kurt Hensel’s writing, then, well, first of all it’s in German (for me anyway that presents a significant barrier), and secondly, it forces you to do a lot of work to understand what he was up to, which now sits in the queue in front of your original intention to learn about $p$-adics. So, that doesn’t really work.

Thus, one often needs to do some sort of creative retrofitting to motivate the generalizations/abstractions, one that differs from the historical motivations. The whimsy motive, along with most of the rest of the stuff in Ben O’s list, strikes me as sort of a stock / shorthand device to get through this step without doing the pedagogical work of finding an intellectual path that recovers the new idea/generalization/abstraction as something there is a need for. It’s an article of faith for me that this path always exists, at any level of prior knowledge, but it is often a great deal of work to find / build it. Your work with Max Ray on complex numbers is an illustration of the kind of pedagogical work I’m talking about.

* I am of course now sorely tempted to try to chart a course for you re: the $p$-adics that will make them seem like the natural answer to a natural question. I need to exercise restraint. But let me offer a few words in the hopes of imparting some intuition / orientation.

Lemme fix $p = 7$, just to concretize.

The starting point is that thinking mod 7 (or mod whatever) is sometimes useful. E.g. since today is Wednesday, I know that 100 days from now will be a Friday, because 100 = 2 mod 7.
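
The weekday calculation can be sketched in a few lines of Python. The helper name `weekday_after` and the Monday-first ordering are arbitrary choices made for this illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def weekday_after(start, days_ahead):
    """Weekday reached by moving days_ahead days forward from start."""
    return DAYS[(DAYS.index(start) + days_ahead) % 7]

print(100 % 7)                          # 2
print(weekday_after("Wednesday", 100))  # Friday
print(weekday_after("Wednesday", 2))    # Friday, same answer: the 98 was ignorable
```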

If you buy that thinking mod 7 is sometimes useful, then I offer you this: thinking mod 7 is a kind of “approximate thinking.” You’re choosing to ignore some info about a number, in order to zero in on other info. In the case of 100, you get to ignore 98, since it’s a multiple of 7. This is analogous to the more standard meaning of “approximate” — where you ignore stuff that is small in [conventional] absolute value — here, we are making a different decision about what we can ignore and what to keep. This is why I’m counting on you already buying the idea that mod 7 thinking is sometimes useful. If you don’t buy this, then the entire thing is a non-starter, but if you do buy it, it means you believe that sometimes you think it’s worth it to ignore multiples of 7. (And, maybe I’m jumping the gun here, but maybe just maybe, motivated by the general principle “it makes sense to ignore small things,” you think this means there is some sense in which it is fair to say a multiple of 7 is “small.”)

Now. The situations that make 7-adics make sense are any situation in which something that’s correct mod 7 is “good”, but if it’s correct mod $49 = 7^2$ it’s “better”, and if it’s correct mod $343 = 7^3$, that’s “better still”, etc. Here is where I cannot afford the time to figure out how to really justify this. The examples I know have the technical-barrier problem, so I would really have to do some work. But if you buy that thinking mod 7 is sometimes useful, then maybe it seems reasonable to you that thinking mod 49 is also sometimes useful. And meanwhile, hitting a certain mark mod 49 is automatically going to cause you to hit a certain mark mod 7 (e.g. being 2 mod 49 means you are 2 mod 7), but not vice versa (as being 9, 16, 23, 30, 37, and 44 mod 49 will also make you 2 mod 7). So a certain target mod 49 is a “tighter restriction” than that same target mod 7; in this sense if you have a target and you know you’ve hit it mod 49, then you know you have a “better approximation” than if you had merely hit it mod 7.
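
The “tighter restriction” claim is easy to check by brute force. This is only an enumeration sketch, with variable names chosen for the illustration.

```python
p = 7

# Every residue mod 49 that looks like 2 when you only care about mod 7.
looks_like_2_mod_7 = [r for r in range(p * p) if r % p == 2]
print(looks_like_2_mod_7)  # [2, 9, 16, 23, 30, 37, 44]

# Conversely, anything that is 2 mod 49 is automatically 2 mod 7.
print(all((2 + 49 * t) % 7 == 2 for t in range(1000)))  # True
```

Seven different targets mod 49 collapse to the same target mod 7, so hitting the mod 49 target pins the number down seven times more tightly.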

Lemme apply this type of thinking to a particular case: the square root of 2. We have $3^2 = 9 \equiv 2 \pmod 7$, so squinting my eyes, 3 is a square root of 2 mod 7. (“Squinting my eyes” is to emphasize the “approximation” metaphor: if I don’t care about multiples of 7, 3 is just as good as a square root of 2.) On the other hand, 3 is not a square root of 2 mod 49, since $3^2=9$ and $2$ are not the same mod 49. I.e. when I look a little more closely, I can see that I am not hitting my target perfectly. On the other hand, the error is “small” (it’s a multiple of 7), so perhaps I can adjust my root-2 estimate by something “small” and it will now hit the target mod 49. … Yes, I can! $10^2 = 100$ is 2 mod 49. So 10 is a square root of 2 mod 49. Note that 10 is the “first order” square root of 2, namely 3, plus a “small” adjustment, namely 7. It is a “better approximation” to the square root of two than 3 was, because it’s correct up to multiples of 49, not just 7.

I can keep going. $10 = 3 + 1\cdot 7$ is not a square root of 2 mod 343, but $108 = 3 + 1\cdot 7 + 2\cdot 7^2$ is! We have $108^2 = 11664 = 2 + 34\cdot 343$, so $108$ is a square root of 2 mod 343. In some sense this is a “still better” approximation to root 2, i.e. it will satisfy a person who does not care about multiples of 343 but does care about multiples of 7 and 49 (not only those who do not care about multiples of 49 or 7).
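
The step-by-step refinement above can be sketched as a brute-force search: at each stage, try every multiple of the current power of 7 as a “small” adjustment and keep the one that hits the target modulo the next power. The function name `sqrt_approximations` is hypothetical, and this is a plain search rather than an optimized lifting procedure.

```python
def sqrt_approximations(target, p, levels):
    """Square roots of target modulo p, p^2, ..., p^levels.

    Each approximation is the previous one plus digit * (previous modulus),
    found by trying every digit from 0 to p - 1.
    """
    approximations = []
    root, modulus = 0, 1
    for _ in range(levels):
        next_modulus = modulus * p
        for digit in range(p):
            candidate = root + digit * modulus
            if (candidate * candidate - target) % next_modulus == 0:
                root = candidate
                break
        else:
            # No digit worked: target has no square root at this level.
            raise ValueError(f"no square root of {target} modulo {next_modulus}")
        approximations.append(root)
        modulus = next_modulus
    return approximations

print(sqrt_approximations(2, 7, 3))  # [3, 10, 108]
```

The search starts from the smaller of the two roots mod 7 (3 rather than 4), which is why it reproduces the sequence 3, 10, 108.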

This is some intuition behind the idea of 7-adic distance. If you believe that it’s sometimes useful to ignore multiples of something, then I’m hoping this gave a glimpse of how the passage from ignoring all 7’s to only ignoring $7^2$’s is a kind of “zooming in” / “insisting on greater accuracy”. If this makes any sort of sense, then the 7-adic absolute value is measuring “the size of what gets ignored”, motivated by the metaphor “if I am insisting on greater accuracy, it means the thing I am ignoring is smaller.”
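
Here is a minimal sketch of that measurement, using the standard definition $|x|_p = p^{-v}$, where $p^v$ is the exact power of $p$ dividing $x$. The definition is the usual one and is not spelled out in the comment; the function names are made up for illustration.

```python
from fractions import Fraction

def padic_valuation(x, p):
    """Exponent of p in a nonzero rational x (negative if p divides the denominator)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    numerator, denominator, valuation = x.numerator, x.denominator, 0
    while numerator % p == 0:
        numerator //= p
        valuation += 1
    while denominator % p == 0:
        denominator //= p
        valuation -= 1
    return valuation

def padic_abs(x, p):
    """p-adic absolute value: p ** (-valuation), with |0| defined as 0."""
    if Fraction(x) == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(x, p))

# The error of each approximation to the square root of 2 shrinks 7-adically.
for approx in (3, 10, 108):
    print(approx, padic_abs(approx ** 2 - 2, 7))
# 3 1/7
# 10 1/49
# 108 1/343
```

Measured this way, the errors $7$, $98$, and $11662$ get smaller at each step, even though they grow in the conventional absolute value.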
