Too Much of A Good Thing

Greg Ashman:

If more guidance makes minimally guided approaches more effective then why not use a fully guided approach? Won’t that be still more effective? It is an argument that plays out again in the book and one that offers little comfort to proponents of open-ended problem solving in high school maths classes.

But, Jordan Ellenberg:

The difference between the two pictures is the difference between linearity and nonlinearity, one of the central distinctions in mathematics…Mitchell’s reasoning is an example of false linearity—he’s assuming, without coming right out and saying so, that the course of prosperity is described by the line segment in the first picture, in which case Sweden stripping down its social infrastructure means we should do the same.

But as long as you believe there’s such a thing as too much welfare state and such a thing as too little, you know the linear picture is wrong. Some principle more complicated than “More government bad, less government good” is in effect. The generals who consulted Abraham Wald faced the same kind of situation: too little armor meant planes got shot down, too much meant the planes couldn’t fly. It’s not a question of whether adding more armor is good or bad; it could be either, depending on how heavily armored the planes are to start with. If there’s an optimal answer, it’s somewhere in the middle, and deviating from it in either direction is bad news.

Also, John Sweller:

That is not to say that there are no disadvantages to the use of worked examples. A lack of training with genuine problem-solving tasks may have negative effects on learners’ motivation. A heavy use of worked examples can provide learners with stereotyped solution patterns that may inhibit the generation of new, creative solutions to problems.

Greg’s argument is, “If a bit is good, isn’t a lot better?” But this sort of falsely linear thinking isn’t compelling, no matter what you think about direct instruction.


How did 69 turn into 29?

Last year, while reading and writing about cognitive load theory, I came across something weird that I couldn’t explain. A paragraph from Greg Ashman’s latest reminds me of this puzzle. It’s really small and inconsequential, but it’s been bugging me. Maybe you can figure it out.

He writes:

One of my PhD supervisors did an experiment in the 1980s. Undergraduates were given as series of problems. Each problem involved a starting number and a goal number. The participants had to get from the first number to the second using only two moves which they could repeat: multiply by three or subtract 29. The problems were designed so that each one was solved by alternating the steps. Although the students could generally solve the problems, very few ever worked out the rule.

Great. Multiply by three, or subtract 29.

Except you go back to that paper, and it’s actually subtract 69.

Screenshot 2016-09-11 at 6.31.37 PM.png

Where did Greg get the “subtract 29” from? I don’t know, but it could be from this piece by Sweller in 2016.

Screenshot 2016-09-11 at 6.33.29 PM.png

Anyway, totally unimportant. Completely uninteresting. But. Did he forget? Was it a typo? Did he decide — as so many before — that 69 is a funny number to talk about in classes?

If you see me and I’m looking pensive, this is probably what I’m thinking about.

Cognitive Load Theory’s Changing Take on Motivation

In 2012, John Sweller (of Cognitive Load Theory fame) sat for an interview about his work. The conversation turned to motivation, and Sweller made it very clear that motivation was beyond the scope of CLT.

“One of the issues I faced with Cognitive Load Theory is that there are at least some people out there who would like to make Cognitive Load Theory a theory of everything. It isn’t. […] It has nothing to say about important motivational factors…It’s not part of CLT.”

Later in the interview he expands on this point.

“Cognitive Load Theory works on the assumption that the students are fully engaged, fully motivated, that their attention is being directed. Cognitive Load Theory has nothing to say about a student who is staring out the window and not listening.”

When I started researching Sweller’s work, I was fascinated by these later interviews, because I saw them as conflicting with his earlier publications. I thought this represented an important shift in his thinking, one that connects to his dismissal of “germane load” from his theory.

That’s what I thought when I wrote the essay. But does the claim hold up?

The first time Sweller writes about motivation is in Sweller & Cooper, 1985.


I was talking to Greg Ashman about this passage, and Greg made a great point. He argued that this early passage is not necessarily in conflict with Sweller’s later interviews. Why not? CLT may consider motivational factors, but it’s not what CLT is about. After all, they didn’t even measure motivation as part of this experiment. True, you need to motivate students to participate in the study, but that’s hardly the same thing as studying motivation!

In Sweller, van Merrienboer and Paas 1998, motivation comes up again (as it not infrequently does in van Merrienboer’s work).


Now, again, it’s true that this mentions motivation. And, at first, I thought that this conflicted with Sweller’s later take. Sweller says in that interview that CLT assumes that students are fully motivated. If students are already fully motivated, then why talk about possible negative effects of motivation?

But this still might not conflict with Sweller’s later statements. After all, this is merely speculating on a possible way worked examples might impact motivation negatively. This does not mean that CLT is about motivation or that its study is part of CLT work.

The best support for the story I told in the essay, I think, comes from van Merrienboer & Sweller 2004. Motivation makes it into the abstract:


“Complex learning is a lengthy process requiring learners’ motivational states and levels of expertise development into account.” Doesn’t that mean that we’re no longer just assuming high levels of motivation in CLT research? And this attention to motivation is called “a recent development in CLT.” So, surely, motivation is part of CLT’s research. No?

I think the clearest statement of motivation’s place in CLT comes in the “discussion” section of this piece:


“Four major developments in current CLT research were discussed…research to take learners’ motivation and their development of expertise during lengthy courses or training programs into account.”

This is the evidence I was confronting, and I’m sure there is more than one way to read it. My read, however, is that this is a claim that CLT research included motivational factors, and that this conflicts with Sweller’s later statements. After all, would Sweller say in 2012 that learners’ changing expertise isn’t a part of CLT research? Certainly, he wouldn’t, as the expertise-reversal effect is still an important part of CLT’s work. Motivation might have continued to be part of CLT, but Sweller changed his mind. That’s my read.

My claim was never that motivation was a core concern of CLT. But I do think that Sweller’s thinking about motivation and CLT shifted in a way that illuminates his development. It’s a shift that I think tells us something about how a major task of scientists of learning is to manage complexity, to decide what to study and what to ignore. (And how it is, to an extent, a choice.) And I do think that Sweller’s thinking about motivation helps illuminate the much more significant change in his thinking about germane load.

As always, I might have gotten this wrong. But this is why I think that there’s something interesting about motivation in CLT.

Cognitive Load Theory and Why Students Are Answer-Obsessed

It’s true: math education doesn’t give a ton of attention to Sweller and cognitive load theory. Math education researchers who are aware of Sweller are most familiar with his attack on problem-based, experiential, discovery and constructivist learning (“An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential and Inquiry-Based Learning”). As Raymond mentioned on Twitter, those within math education who are likely to recognize Sweller are equally likely to dismiss him and his work.

Part of this, I think, has to do with focusing on the wrong aspects of Sweller’s work. Ask 100 people what the key idea of Sweller’s work is, and I bet 99 would say: it’s easy to overload the working memory of students. For learning, it’s important not to. So, don’t. An important but limited insight. (We’re trying not to overload anyone!)

The last 1 person out of the 100 is me. As far as math education is concerned, I think the key idea of Sweller’s work is about problem solving, not cognitive load. Here is that key idea: problem solving often forces a person into answer-getting mode, and answer-getting mode is incompatible with learning something new.

(“Answer-getting” mode also has to do with expectations that students have about math class and the sorts of activities they think are valued in mathematics. Sweller shows it has a cognitive element too.)

Sweller’s early work was with number puzzles. Participants in his studies solved the puzzle successfully, but never came to notice a fairly simple pattern which was sort of the “key” to finding any solution. Why? There were two reasons:

  1. When you’re looking for the solution to a problem, your attention is massively restricted to those things that are directly relevant to finding the solution. Lots of important details of the scenario or environment get ignored.
  2. Attention is a zero-sum game. There’s only so much that a person can notice. A person focused on finding the solution is unable to focus on much else.
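For readers who haven't seen this kind of puzzle, here is a minimal sketch of one, assuming the two moves from the 1980s experiment described earlier (multiply by three, or subtract 69). The start and goal numbers (31 and 3), the function name `solve`, and the search limit are hypothetical choices made for illustration; they are not taken from Sweller's materials. The breadth-first search stands in for a goal-focused solver: it finds a shortest route to the goal without ever representing the rule that the moves alternate.

```python
from collections import deque


def solve(start, goal, subtract=69, limit=10_000):
    """Breadth-first search for a shortest move sequence from start to goal.

    Only positive values up to `limit` are explored (an illustrative bound).
    """
    moves = {
        "x3": lambda n: n * 3,
        f"-{subtract}": lambda n: n - subtract,
    }
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        value, path = queue.popleft()
        if value == goal:
            return path
        for name, move in moves.items():
            nxt = move(value)
            if 0 < nxt <= limit and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None  # goal not reachable within the limit


# Hypothetical puzzle: get from 31 to 3.
print(solve(31, 3))  # → ['x3', '-69', 'x3', '-69']
```

The solver reaches the goal every time, yet nothing in it "knows" that the answer is always multiply, subtract, multiply, subtract. That pattern is only visible to someone who steps back and compares solutions, which is exactly what goal-chasing leaves no attention for.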

(For more, read this part of my essay.)

I have found this to be absolutely true and deeply insightful. The first time the idea really hit me was during Christopher Danielson’s talk, titled “What’s the Difference Between Solving A Problem and Learning Mathematics?” There is a difference. Sweller helps us get specific about some of the reasons why.

These limitations of problem solving guide my daily classroom work. My 8th Graders are wrapping up their study of linear functions and moving on to exponential functions. Yesterday, I found myself wanting my students to start thinking about the differences between linear and exponential graphs and patterns. I took this image from David Wees’ project and displayed it on the board:

Screenshot 2016-04-13 at 6.05.12 AM

In the past, my first instinct would have been to pose the problem as quickly as possible. “What are the coordinates of point B? of point A?” I would then give my students time to think, and I would have expected some learning to have occurred.

Now I know that this could be a particularly bad way to ask my students to begin their work. They probably wouldn’t notice what I want them to notice. Instead, they’d probably go into that answer-getting mode that focuses all their resources in an unproductive way:

Screenshot 2016-04-13 at 6.08.40 AM.png

Another key insight of Sweller has to do with how to avoid ensnaring students in this unproductive struggle. One suggestion of Sweller’s is to ask less-specific questions. These nonspecific questions don’t funnel attention in the way specific questions do, and they therefore don’t overload students in quite the same way.

Sweller first described the power of nonspecific questions with regard to angle problems. Rather than asking students to find a particular angle, he asked “Calculate the value of as many variables as you can.”

Screenshot 2016-04-13 at 6.15.32 AM
Sweller, Mawer & Ward, 1983.


With my 8th Graders, yesterday I began class with two nonspecific questions. I asked these questions so that they’d notice as much about the diagram as possible and start putting together some of the pieces about exponential relationships.

My first question: “What do you notice?” I waited for lots of hands to go up, and then I quickly called on three students. (I find it’s important to move quickly here — not so interesting to rattle through everyone’s noticings.)

My second question: “Study the diagram and find something to figure out.” I asked students to do this in their heads, alone. Then, “Talk to your partner — come up with at least two different things to figure out, then as many as you can.” (What counts as something “figured out”? We’ve done this routine many times, so my students know from experience.)

Here is an incomplete list of what my students calculated/figured out from the exponential graph:

  1. The y-coordinates are doubling
  2. The y-axis is going up by 4
  3. The slopes are changing between each pair of points
  4. The graph is non-proportional
  5. The next coordinate would be (6, 64)
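Several of these noticings can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the plotted points follow y = 2^x at whole-number x values from 0 to 5. That assumption is consistent with the coordinates mentioned in this post, (0, 1), (2, 4) and (6, 64), but the image itself isn't reproduced here, so treat the point list as illustrative.

```python
# Assumed points on the graph: y = 2**x for x = 0..5 (illustrative).
points = [(x, 2 ** x) for x in range(6)]

# Compare each point with the next one along the curve.
pairs = list(zip(points, points[1:]))

# Noticing 1: the y-coordinates double from one point to the next.
ratios = [y2 / y1 for (_, y1), (_, y2) in pairs]

# Noticing 3: the slope changes between each pair of points.
slopes = [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in pairs]

# Noticing 4: non-proportional, since y/x is not constant (and y is 1 at x = 0).
y_over_x = [y / x for x, y in points if x != 0]

# Noticing 5: continue the doubling pattern by one more step.
last_x, last_y = points[-1]
next_point = (last_x + 1, last_y * 2)

print(ratios)      # every ratio is 2.0
print(slopes)      # 1.0, 2.0, 4.0, 8.0, 16.0
print(next_point)  # (6, 64)
```

A constant ratio alongside a changing slope is the contrast with linear functions that I was hoping the class would start to see.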

If my students had mentioned, at this phase, that the coordinates of B were (2,4) we would have moved on. Since they hadn’t, and since they were saying so many smart things, I decided that this would be a great time to ask a third question:

“What are the coordinates of point B? point A?”

My students were able to answer these specific questions, but that’s hardly the point. Sweller’s research suggests that you can’t use problem-solving success as a gauge of whether kids have learned something or not.

I do think, though, that the reasons my students gave for their correct answers are revealing. Some students, in justifying their answers, mentioned that you could be sure that point A was at (0,1) because the y-coordinate seems to be 1/4 of the way up to 4. Other students then pointed out that (0, 1) fits the general pattern. What’s interesting is that this first observation — the position of point A up the axis — never came up in the first two questions I asked. That makes sense, because that way of looking at the position of point A has nothing to do with the exponential pattern.  In fact, it’s the sort of hyper-focused response that you’d only expect to hear when a very specific goal has been set by the teacher — find the coordinates of point A. Otherwise, that’s not the thing that’s worth noticing here (probably). It misses the forest for the trees in the way people do when they are focused on achieving a narrow goal.

The second response, though, showed that some of my students had started making good connections. They justified the coordinates of points A and B based on the general pattern.

All this suggests to me that while some of my students are ready for working on specific problems, many of them aren’t yet there.

Asking more nonspecific problems isn’t the only recommendation that Sweller makes, of course. He’s better known for recommending the heavy use of worked-out examples and explanations in class. We do those too, though probably not as often as Sweller would like. Still, there’s more to Sweller’s theory than worked examples.

The key idea here is that specific questions cause students to chase specific goals. Chasing a goal isn’t always helpful for learning. On the one hand, I think this makes the case for developing a specific question more slowly, asking students to notice before posing a problem. On the other, this calls for us to be more cautious and deliberate about how we use problems in our teaching, especially in the early stages of teaching a new idea.


Cognitive Load Theory is More Than Worked Examples

For the last few months, I’ve been working hard on an essay about John Sweller’s cognitive load theory. This is, by no means, a comprehensive essay about CLT. I wanted to tell a very specific story in the piece — about how Sweller came to invent his theory, how he changed it so that it could better embrace greater complexity in classroom learning, and how he ultimately restricted the boundaries of his theory to avoid this complexity.

One thing I don’t talk much about in the piece is the implications of CLT for teachers of math. CLT is highly active in arguments about how best to teach math, and many who identify as “traditionalists” cite CLT to support their views. This, in turn, leads those who identify as “progressives” to seek to discredit CLT. I have no desire to negotiate this terrain.

Discussion of CLT, I find, often focuses on one specific teaching recommendation: worked examples. See, for example, Deans for Impact’s The Science of Learning report:

Screenshot 2016-04-10 at 12.17.58 PM.png

A closer look at the work of CLT, I think, complicates the focus on worked examples in several ways.

First, there are other ways that Sweller and CLT identify for reducing cognitive load. In particular, Sweller has found that problems with non-specific goals (i.e. more open questions) are helpful for reducing cognitive load. You don’t often hear this aspect of Sweller’s work come up in debates, but I think that’s a shame, because I think both progressives and traditionalists could support the use of these sorts of questions.

Second, there was a period of Sweller’s career when he trained his eye on learning more complex skills in classroom environments. Though he eventually moved away from this work, during this time he noted that there can be issues with worked examples, when put into practice. For example, in 1998 he wrote (with his co-authors) that “A lack of training with genuine problem-solving tasks may have negative effects on learners’ motivation.”

“A heavy use of worked examples can provide learners with stereotyped solution patterns that may inhibit the generation of new, creative solutions to problems…For this reason, goal-free problems and completion problems…may offer a good alternative to an excessive use of worked examples.”

Further, work by other CLT researchers has found that worked examples can be bested by “completion” problems, where there is thinking left for students in the task. This is the work of van Merrienboer, which I also write about in the essay. Here’s a quote about worked examples from his research:

“…students will often skip over the examples, not study them at all, or only start searching for examples that fit in with their solution when they experience serious difficulties in solving a programming problem. … [In completion problems] students are required to study the examples carefully because there is a direct, natural bond between examples and practice.”

So CLT research has at least two alternatives to worked examples for novice learning: open questions and completion tasks. And research within CLT has identified motivational or practical issues with excessive use of worked examples — these are from papers that Sweller himself wrote.

(The truth is that, depending on how complex the skill we’re trying to teach is, van Merrienboer’s line of thinking opens up a great many possibilities beyond worked examples. While he’s opposed to throwing novices into the deep end, well, everyone should be opposed to that. Instead, he wants to find authentic, motivating tasks that are manageable for novices. For more, see his “Ten Steps to Complex Learning.”)

I don’t think it’s surprising that “worked examples” have earned outsized attention from educators. This is the same thing that happens when educators embrace research, in general. A few years ago I read Jack Schneider’s From the Ivory Tower to the Schoolhouse. The book is about why some research catches on with teachers, while most does not. He identifies four key characteristics of research that makes the jump to practitioners:

  1. Perceived Significance: It needs to be perceived as coming from reliable, important names. (e.g. “a bunch of Harvard researchers just found that…”)
  2. Philosophical Compatibility: The research needs to be in sync with the beliefs of the educators who embrace and share it.
  3. Occupational Realism: It needs to be easy to put in immediate use.
  4. Transportability: It needs to be easy to share — tweetable, even.

While Sweller doesn’t have a name-brand research pedigree that is recognizable to us in the US, worked examples otherwise fit this framework perfectly. It’s a practice that is very realistic (most teachers are already using lots of worked examples and explanations), it’s very easy to share the idea, and for those traditionalists who have embraced it, it is very much ideologically safe.

That’s not a criticism of traditionalists who embrace worked examples — it’s just a point about how research gets shared in education. “Worked examples,” like “growth mindset” or “project-based learning,” fit Schneider’s framework quite well.

What this means, though, is that you have to listen carefully to hear about anything beyond worked examples when people talk about CLT. But this emphasis on worked examples does not fairly represent Sweller or CLT. There are a host of additional ideas and techniques that his and others’ CLT research has found: open questions, completion tasks, and motivational and practical issues with worked examples in practice.

You can’t really hope to change the way people talk about anything in education, let alone research. You can hope to dig a bit deeper and find a bit of understanding beyond the noise, though. That’s what this project has been about, for me. I’m excited to share it, and I’ll continue to add some thoughts about CLT over the next few weeks.

Dissent of the Day

I said [Three Act problems] are most valuable to me before learning skills, or rather as the motivation for learning skills. I don’t expect that students will just figure everything out on their own, though. Act one helps generate the need for the tools I can offer them here in act two.

-“Teaching With Three Act Tasks: Act Two,” Dan Meyer

I’ve been thinking about it, and I think I disagree with Dan’s take here. I think there are important differences between providing instruction during, before or after a tough mathematical experience, and that instruction during a problem is often bound to be lost in the flood of ideas that a mind is awash in.

Here’s where I’m coming from. Over the past few class periods, my 4th Graders have been working on a lovely little activity. We watched a short video showing Andrew papering his cabinet with sticky notes. How many sticky notes would it take to cover the entire thing?


I showed this video, and was disappointed by the tepid response from my students. Then I asked my students to estimate the number of stickies it would take to cover the cabinet. More blahs. And then I clarified that we’re trying to figure out how many stickies would cover the entire cabinet, and my kids exploded with ideas and excitement: “Wait, can you give us time to figure this out?”

Really, really great stuff.

While walking around, I noticed some kids getting lost in their calculations. Lots of great ideas, but constantly losing the thread.

Other kids, though, used diagrams to preserve their line of thought. These kids, even if they were less computationally sophisticated than other students in class, were finding relatively more success in the problem.


When I noticed this, I realized that this sort of diagramming was an important mathematical idea that I should make explicit to everyone. When pairs called me over to help them make sense of their confusing calculations, I made the suggestion: here’s a diagram, here’s how you can use it, this could help with where you’re stuck.

No dice, so I decided to pause class and say it to everyone: hey all! I noticed that the tricky thing isn’t just the calculations, but trying to keep track of what you’ve figured out and what you still need to work on. Diagrams can help, here’s a diagram, here’s how you can use it, you might try this.

As I walked around some more, I poked around to see if pairs had adopted my suggestion. No dice, still.

Bell rings, kids hand in their work, that’s that for the day.

The next day, I start class by saying, “I noticed a lot of us got stuck on the problem yesterday. We’re going to keep on working today, but here’s something that might help: here’s a diagram, etc.”


What happened? Hard to know, of course, but here’s what I’m thinking: the first time around, my kids had a million mental distractions. Some were wondering if their calculations were right. Others were just trying to get a grip on a plan of attack for the problem. Others were trying to remember where on their page they had written their current tally of the stickies on the front and back.

In other words, these kids had a lot to think about during this problem, and they weren’t really able to dedicate the brain space needed to understand a new and unfamiliar strategy.

This is also how I make sense of something I’ve noticed in my Algebra 1 class. I haven’t yet given these kids activities that explicitly address the “cover-up” method for solving equations, but I keep trying to bring it up when kids ask me for help with equations in class. The thing is, it never seems to stick.

It seems to me that if we think “just-in-time” instruction works particularly well, my kids should be able to hold onto this method a bit better than they currently do. After all, they have a clearly felt need for some new bit of math (they called me over, right?) and they are getting the instruction during their felt moment of need. Super-duper effective setting for instruction, right?

But then it doesn’t stick. And I think it’s for the same reason that my 4th Graders didn’t take up the “draw a picture” suggestion: they’re too mentally distracted to really focus on the new idea and properly learn it. After all, learning a new idea in all its proper generality can be a pretty heady bit of work. When my kids call me over for help with their equations, they’re potentially thinking about many other mathematical things — where am I in the problem? did I make a mistake by subtracting? what’s 4 divided by 6? — and often can’t focus on the strategy itself.

This, then, is a sort of dissent against the Three Act model of instruction. New mathematical ideas are not best introduced in the middle of a problem if they’re going to get the mental real estate they deserve. Students are often productively distracted by a difficult problem, and unable to focus on the strategy or tool at hand.

The thing that works better, in my experience, is following up a tough experience with a new idea or tool. This seems to me closer to ideal. The students get to spend time struggling with a tough problem, which I think is valuable all on its own; they thoroughly understand the problem context, since they spent careful time on it; when I introduce a new idea after this experience, they are in a strong position to focus on this tricky new idea itself rather than the million other things it takes to comprehend this new tool.

As Dan Schwartz writes:

This report is based upon work supported by the National Science Foundation under REC Grant 0196238.

OK fine, but he also writes:

Instruction that allows students to generate imperfect solutions can be effective for future learning.

But instruction that comes in the heat of the moment is not looking towards the future — it’s coming during the chaotic present, a time when the student’s mind is being bombarded with many tricky ideas that are specific to a particular problem context. I don’t think that’s a great time to introduce a new idea, but tomorrow might be.