Reading Research: The Case of Mrs. Oublier


A Revolution in One Classroom: The Case of Mrs. Oublier (link) is an oft-cited piece of education research by David K. Cohen. It’s a case study of just a single teacher (Mrs. O) and her math teaching, at a time (the ’80s) when California lawmakers sought to radically transform math teaching in the state.

Mrs. Oublier is a pseudonym, oublier meaning “to forget” in French. She’s earned this pseudonym for thinking her teaching had undergone a revolution, though in the eyes of Cohen she hardly changed any of the important stuff. I guess the point is that she oublier-ed to make these changes? Or that reformers didn’t help her make them?

Anyway, a lot of the fun of the piece is seeing the funhouse-mirror ways in which Mrs. O interprets those cutting-edge ideas about manipulatives, small group work, and estimation. And Cohen has serious things to say about why policy-makers never quite reached Mrs. O in the way they intended to, though I might question some of his conclusions.

Another thing that’s interesting about this piece is what it’s not: a representative sample from the teaching population. It’s the story of one teacher. Cohen tells us that Mrs. O’s story matters, but why should we believe him?

There’s no denying that Cohen tells a good story. But isn’t research supposed to be more than a good story?


Mrs. O has been teaching second grade math for four years. The kids like her; colleagues like her; administrators think she’s doing a great job.

As a student, Mrs. O hadn’t liked math much, and she didn’t do too well in school. When she got to college, though, she started doing better. What changed? “I found that if I just didn’t ask so many why’s about things that it all started fitting into place,” she tells Cohen. So, that’s not a great start.

And yet, Mrs. O tells Cohen that she’s interested in helping her students really understand math. She also tells him that she’s experienced a real revolution in her teaching, a departure from the traditional, worksheet+drill methods she used when she began. On the basis of his observations, Cohen is strongly inclined to disagree with her on this.

In the centerpiece episode, Cohen catches Oublier in the midst of a fairly ridiculous lesson. Oublier wants to teach her students about place value (so far so good). To do this, she wants to introduce another base system (debatable, but not necessarily a disaster). So Oublier gives each kid a cup of beans and a half-white/half-blue board.

Mrs. O had “place value boards” given to each student. She held her board up [eight by eleven, roughly, one half blue and the other white], and said: “We call this a place value board. What do you notice about it?”

Cathy Jones, who turned out to be a steady infielder on Mrs. O’s team, said: “There’s a smiling face at the top.”

On a personal note, I have been teaching 3rd and 4th Graders for four years and the idea of giving kids those little cups of beans gives me minor terrors. What if the cups spill? How early do you have to get to school to set up the beans? What if a kid eats a bean?

Anyway, after Mrs. O has ensured that all the kids noticed that their boards are half-white and half-blue, she starts the game. The game is supposed to be about grouping and regrouping in place value systems, but it’s really entirely about beans. She calls out a command, and the kids add a bean. At no time does she connect the beans to numbers.

According to Cohen, this was no accident, as Mrs. O wasn’t really a fan of making numbers explicit in her activities:

This was a crucial point in the lesson. The class was moving from what might be regarded as a concrete representation of addition with regrouping, to a similar representation of subtraction with regrouping. Yet she did not comment on or explain this reversal of direction. It would have been an obvious moment for some such comment or discussion, at least if one saw the articulation of ideas as part of understanding mathematics. Mrs. O did not teach as though she took that view. Hers seemed to be an activity-based approach: It was as though she thought that all the important ideas were implicit, and better that way.

Oublier is a huge believer in manipulatives — in fact, the transition from worksheets to manipulatives seems to be a big part of what her “revolution” entailed. For Mrs. O, kids learn through the physical manipulation of the objects. As in, learning is the direct result of touching beans:

Why did Mrs. O teach in this fashion? In an interview following the lesson I asked her what she thought the children learned from the exercise. She said that it helped them to understand what goes on in addition and subtraction with regrouping. Manipulating the materials really helps kids to understand math, she said. Mrs. O seemed quite convinced that these physical experiences caused learning, that mathematical knowledge arose from the activities.

Oublier tells Cohen that she relies heavily on a textbook, Mathematics Their Way, and that this text was the major source of some of her new ideas about physical activities and teaching math. From poking around, it looks like the whole text has been posted online, including the lesson that Mrs. O was caught teaching. Here’s what the bean-counting activity looks like in the text:

Screenshot 2017-06-11 at 8.55.38 PM.png

OK, now the next page of that activity:

Screenshot 2017-06-11 at 9.11.52 PM.png

But you won’t believe what’s on the page after that:

Screenshot 2017-06-11 at 9.12.43 PM.png

This is sort of getting repetitive so I’ll just skip ahead five pages:

Screenshot 2017-06-11 at 9.13.37 PM.png

Cohen comes down pretty hard on this curriculum, and on Mrs. O for using it:

Math Their Way fairly oozes the belief that physical representations are much more real than symbols. This fascinating idea is a recent mathematical mutation of the belief, at least as old as Rousseau, Pestalozzi, and James Fenimore Cooper, that experience is a better teacher than mere books. For experience is vivid, vital, and immediate, whereas books are all abstract ideas and dead formulations.

I’ve focused on the manipulative episode, but that’s just one part of her teaching that’s detailed in the piece. According to Cohen, Oublier generally seems to adopt the exterior of cutting-edge math teaching while sort of missing its point. She asks kids to estimate, but doesn’t give them chances to think or share ideas. She uses manipulatives, but doesn’t really ask kids to think much with them. She puts kids into small groups, but basically uses this as a classroom management structure. She avoids numbers and abstraction wherever possible.

This was certainly not what California’s math reformers had in mind.


The point, for Cohen, is that California’s math reformers let Mrs. O down. But how, exactly?

I found myself needing more context for the California reforms than Cohen provides. Fortunately, the journal issue in which Mrs. O originally appeared was entirely dedicated to the California math reforms. (In fact, every piece in that issue was a different in-depth case study like Mrs. O.)

Cohen actually leads off the issue with a helpful summary of the aims and methods of the 1985 math reforms (link). At their center was a document, the California Math Framework. The Framework called for a transformation of math teaching away from rote memorization and drill, and towards a focus on conceptual understanding, teaching kids to communicate about math, problem solve, work in groups, make sense of math, etc.

So far, nothing new. Reform groups like NCTM have been pumping out these documents for a century.

What was new was the muscle California chose to employ. The state education office said that they would only reimburse districts for textbooks that met the standards of the Framework. And then they actually followed through by rejecting all the texts that publishers initially submitted. Eventually, the state got what they wanted and created an approved list of textbooks for districts to choose from.

(As Alan Schoenfeld notes in his Math Wars piece, California, along with Texas and New York, determines what gets published nationally because of the size of its market. The publishers basically design their books for the big states, and the rest of the country gets dragged along. So California’s reform muscle had national implications.)

This was half the plan. The other half was to change the state tests for kids so that they also reflected the vision of the Framework. The idea was that if textbooks and tests were in place, teachers would come around all on their own.

I missed this the first few times, but this is why Cohen dwells so much on Oublier’s textbook choice. Oublier’s favored Math Their Way text was not an accepted California text, and Oublier’s district had adopted something else. Oublier likes Math Their Way, though, so she just uses that in her classroom instead. None of her superiors seems to mind either.

In other words, that entire “change teaching by making a list of textbooks” plan was sort of stupid. It failed to account for the ability of teachers to get other textbooks if they wanted to.

The fundamental assumption of the policy seemed to be that teachers need permission, or perhaps incentives, to teach in new ways. As Cohen points out — over and over — this is not the case. Teaching in fundamentally different ways implies believing that you should teach differently as well as knowing how to do so.

It’s pretty simple, actually: if you want to change teaching, you can’t ignore the teachers.


Even as Cohen critiques the California reforms, he still seemed to me pretty cheery about the potential for policy to impact reform.

First, he really does seem to give a lot of agency to math textbooks. He keeps on talking about the influence of the Math Their Way book on Mrs. O. On the one hand, the book’s influence on her comes at the expense of the Framework’s reach. At the same time, if a textbook can really have such a strong impact on a teacher, then the premise of the California reforms has been upheld. If you’re a reformer reading Cohen, I imagine that your mind starts wandering: imagine what would’ve happened if we could’ve gotten the right book in her hands!

Beyond Cohen’s implicit optimism about textbook reform, he also wonders aloud about the possibility that a bit of incentive-engineering could have steered someone like Mrs. O towards better teaching:

“The only apparent rewards were those that she might create for herself, or that her students might offer. Nor could I detect any penalties for non-improvement, offered either by the state or her school district.”

These two sources of optimism, when put in context, seemed a bit dated to me. Cohen published this article in 1990, just after NCTM published its Curriculum and Evaluation Standards for School Mathematics in 1989. This was, in many ways, a higher-profile go at California’s Framework, and (surprisingly to all involved) it took off, becoming a blockbuster for NCTM.

In the 90s, NSF would fund the development of new math texts that were aligned with the NCTM standards. My sense is that they didn’t live up to the expectations of the textbook-optimists. The texts were just texts, tools that teachers could use well or poorly depending on their understanding of math and of teaching.

It turns out: textbooks can’t transform teachers.

(Textbooks, it also turns out, can become highly visible targets of controversy, and nearly all use of the reform textbooks became contentious in the 90s. So that seems like it needs to be part of the textbook-reform calculus.)

Cohen seems to think that Math Their Way transformed Mrs. O, but he also thinks that she didn’t really revolutionize her teaching. The changes were cosmetic. And there’s a huge difficulty determining how the text impacted her because of the plain fact that she chose this curriculum. Presumably, she chose it because she was disposed to. It fit with her understanding of math and of teaching. It didn’t fundamentally challenge her, and I see no reason to think that a text has any such power over a teacher, even when imposed.

Cohen’s other musing — about incentives — has echoes in No Child Left Behind and performance pay reforms. These reforms have also failed to live up to the dreams of the reformers, as all reforms do, and teaching chugs along, mostly as it has.

At times, it seemed to me that Cohen believes that the fundamental problem, for Mrs. O, is that her views on the nature of math remain unchanged:

…however much mathematics she knew, Mrs. O knew it as a fixed body of truths, rather than as a particular way of framing and solving problems. Questioning, arguing, and explaining seemed quite foreign to her knowledge of this subject. Her assignment, she seemed to think, was to somehow make the fixed truths accessible to her students.

I’m not particularly sympathetic to this critique. Math, among other things, is a fixed body of truths (theorems, facts, relationships) that we ought to help students know.

But forget that for a moment. Cohen sometimes seems to think that this isn’t just a problem for Mrs. O, but the root problem. If we could just help Oublier see that math isn’t quite as she thinks it is — that it’s dynamic, a source of puzzles, it’s about thinking and not just about knowing — then her teaching really would undergo a real revolution.

This seems to be where we are, right now, in math education reform. We’re not trying to save the world with NSF-funded textbooks, and we’re not hoping to incentivise great teaching. We believe, like Cohen, that the fundamental problem is one of learning, and that the fundamental problem is a fundamental problem, some ambitiously big thing that, if we can help teachers attain, the rest of their teaching will fall into place.

Right now, one version of the “fundamental problem” is productive struggle. NCTM has included this in their latest set of reform standards, the Principles to Actions standards. And if you’re in Baltimore this July, you can attend a three-day summer institute focused on productive struggle. The workshop promises to show how productive struggle is tied to every dimension of effective math instruction, from planning to feedback to wider advocacy.

I don’t think I believe in this sort of reform either. Cohen keeps drawing comparisons in this piece between teacher and student learning — both are challenging, he says, both take time. And that’s true. But imagine if we treated students like teachers. In other words, imagine if instead of teaching math to kids we had a workshop a few times a year where we tried to fundamentally alter their conceptions of math, and then sort of hoped that the rest of their math learning would just fall into place.

I know the comparison isn’t exactly direct, or fair, but I don’t believe that any knowledge can be altered by changing one fundamental element. Knowledge isn’t really structured that way, it seems to me. It’s not built on a foundation. To alter teaching you’d have to alter it broadly, not centrally. And broad change just can’t happen in a three-day workshop.

The final source of optimism that Cohen raises is that maybe Mrs. O represents progress for math reform. Though she hasn’t seemed to internalize the message of the reform, this sort of messy progress is what progress actually looks like.

I have no way of knowing if that’s true, but it certainly strikes me as possible. I haven’t read more recent work of Cohen’s. I wonder if, looking back on the last 30 years of reform, he’s still as optimistic.


Hey, wait a second! This is just a single case study. We were swept along in this gripping tale (aptly summarized) and assumed she represented some larger trend, but that’s just the illusion of focus. Cohen’s fooled us, then, hasn’t he? Maybe Mrs. O means nothing at all. (Or, at least, nothing beyond her own case.)

There are two things that temper this sort of skepticism. First, the journal that published Mrs. O also published four other case studies in the same issue (open version). So in addition to the case of Mrs. O, you also get the case of Carol Turner, Cathy Swift, Joe Scott, and Mark Black.

(Unclear if the other pseudonyms are also supposed to be deeply meaningful. Mark Black, because policymakers treat him like a black box. Cathy Swift, because the reforms were too fast! The other two stump me. Maybe they’re anagrams? Joe Scott = COOT JEST.)

Five case studies are only a bit better than one, but these other four cases present a lot of the same mixed-success-at-best themes as Mrs. O’s case. That helps.

The other thing that tempers skepticism about Mrs. O’s relevance is that Cohen actually also identified the “forgotten teacher” problem in a very different piece of research.

That other piece is called Instructional Policy and Classroom Performance: The Mathematics Reform in California. This time around, Cohen and his team do pretty much the opposite of “sit in the back of a classroom and watch.” They survey 1,000 California elementary teachers. They ask teachers to rate how frequently they employ various instructional activities in class. Hey, they ask, wouldn’t it be nice if all these teacher responses really pointed to two types of teachers? We could call them “traditional” and “reform-friendly”…

Err, did I say “traditional”? I meant “conventional”:

Screenshot 2017-06-12 at 10.51.01 PM.png


Anyway, Cohen’s group also asked teachers what professional learning opportunities they had, in relation to the math reforms. (I love that ‘Marilyn Burns’ is an option.)

Screenshot 2017-06-12 at 10.52.30 PM.png


What they find basically supports Cohen’s take in his Mrs. O piece — reform is possible, but only when it focuses on professional development that targets teacher learning:

Our results suggest that one may expect such links when teachers’ opportunities to learn are:

  • grounded in the curriculum that students study;
  • connected to several elements of instruction (for example, not only curriculum but also assessment);
  • and extended in time.

Such opportunities are quite unusual in American education, for professional development rarely has been grounded either in the academic content of schooling or in knowledge of students’ performance. That is probably why so few studies of professional development report connections with teachers’ practice, and why so many studies of instructional policy report weak implementation: teachers’ work as learners was not tied to the academic content of their work with students.

Some people love the Mrs. O piece but hate the sort of study that we previously read here, the one about teacher-centered instruction for first graders. First, because such studies rely on teacher responses to survey questions, and how much can you really learn from that? Second, because the statistical work can hide researcher assumptions that then become tricky to dig out. Third, because with scale come quality control issues. You really no longer know what you’re dealing with.

To which we might ask: why did Cohen produce exactly this kind of study when it came time to evaluate the success of California’s reforms?

I talk to just as many people, though, who hold the complete opposite view. To them, something like the Mrs. O study is useless, as it doesn’t help us identify the causal forces at work. Maybe the reform failed Mrs. O, but compared to what? There are no controls, and without some sort of random assignment to a treatment can we really be sure that a focus on teacher-learning would make the difference Cohen said it would?

Is it too soft of me to say that both critiques are right?

It’s not my job to study teaching, but it sure seems hard. Every research approach has trade-offs. The way I see things, it’s best to use multiple, incompatible approaches to study the same things in teaching from wildly different perspectives. Why? Because of how it’s possible to take wildly different incompatible perspectives on teaching.

At one point, Cohen points out that Mrs. Oublier seemed comfortable living in contradiction:

Elements in her teaching that seemed contradictory to an observer therefore seemed entirely consistent to her, and could be handled with little trouble.

But there really isn’t anything strange here at all. Everyone is willing to live with some contradictions in their lives. Contradictions can be unlivable, but they can also be productive — in teaching, in life, but also in research. Intellectually incompatible perspectives can be desirable.

Anyway, enough about all this. What should we read next?


Feedbackless Feedback


Not all my geometry students bombed the trig quiz. Some students knew exactly what they were doing:

Screenshot 2017-05-26 at 3.12.57 PM

A lot of my students, however, multiplied the tangent ratio by the height of their triangle:

Screenshot 2017-05-26 at 3.19.05 PM.png

In essence, it’s a corresponding parts mistake — the ’20’ corresponds to the ‘0.574’. The situation calls for division.

Half my class made this mistake on the quiz. What to do?


Pretty much everyone agrees that feedback is important for learning, but pretty much nobody is sure what effective feedback looks like. Sure, you can find articles that promise 5 Research-Based Tips for great feedback, but there’s less there than meets the eye. You get guidelines like ‘be as specific as possible,’ which is the sort of goldilocks non-advice that education seems deeply committed to providing. Other advice is too vague to serve as anything but a gentle reminder of what we already know: ‘present feedback carefully,’ etc. You’ve heard this from me before.

As far as I can tell, this vagueness and confusion accurately reflects the state of research on feedback. The best, most current review of feedback research (Valerie Shute’s) begins by observing that psychologists have been studying this stuff for over 50 years. And yet: “Within this large body of feedback research, there are many conflicting findings and no consistent pattern of results.”

Should feedback be immediate or delayed? Should you give lots of info, or not very much at all? Written or oral? Hints or explanations? If you’re hoping for guidance, you won’t find it here. (And let’s not forget that the vast majority of this research takes place in environments that are quite different from where we teach.)

Here’s how bad things are: Dylan Wiliam, the guy who wrote the book on formative assessment, has suggested that the entire concept of feedback might be unhelpful in education.

It’s not looking like I’m going to get any clarity from research on what to do with this trig quiz.


I’m usually the guy in the room who says that reductionist models are bad. I like messy models of reality. I get annoyed by overly-simplistic ideas about what science is or does. I don’t like simple models of teaching — it’s all about discovery — because I rarely find that things are simple. Messy, messy, (Messi!), messy.

Here’s the deal, though: a reductionist model of learning has been really clarifying for me.

The most helpful things I’ve read about feedback have been coldly reductive. Feedback doesn’t cause learning. Paying attention, thinking about new things: that leads to learning. Feedback either gets someone to think about something valuable, or it does nothing at all. (Meaning: it’s affecting either motivation or attention.)

Dylan Wiliam was helpful for me here too. He writes,

“If I had to reduce all of the research on feedback into one simple overarching idea, at least for academic subjects in school, it would be this: feedback should cause thinking.”

When is a reductive theory helpful, and when is it bad to reduce complexity? I wonder if reductive theories are maybe especially useful in teaching because the work has so much surface-level stuff to keep track of: the planning, the meetings, all those names. It’s hard to hold on to any sort of guideline during the flurry of a teaching day. Simple, powerful guidelines (heuristics?) might be especially useful to us.

Maybe, if the research on feedback were less of a random assortment of inconsistent results, it would be possible to scrape together a non-reductive theory of it.

Anyway this is getting pretty far afield. What happened to those trig students?


I’m a believer that the easiest way to understand why something is wrong is usually to understand why something else is right. (It’s another of the little overly-reductive theories I use in my teaching.)

The natural thing to do, I felt, would be to mark my students’ papers and offer some sort of explanation — written, verbal, whatever — about why what they did was incorrect, why they should have done 20/tan(30) rather than 20*tan(30). This seems to me the most feedbacky feedback possible.

But would that help kids learn how to accurately solve this problem? And would it get them to think about the difference between cases that call for each of these oh-so-similar calculations? I didn’t think it would.
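For what it’s worth, here is a minimal sketch of the two calculations side by side, in Python. I’m assuming a right triangle where the 20 is the side opposite a 30° angle and the unknown is the adjacent side; that setup is my reading of the numbers above, and the function names are made up for illustration.

```python
import math

def adjacent_from_opposite(opposite, angle_degrees):
    """Solve tan(angle) = opposite / adjacent for the adjacent side."""
    return opposite / math.tan(math.radians(angle_degrees))

def opposite_from_adjacent(adjacent, angle_degrees):
    """Solve tan(angle) = opposite / adjacent for the opposite side."""
    return adjacent * math.tan(math.radians(angle_degrees))

# Assumed setup: the known side (20) is opposite the 30-degree angle.
correct = adjacent_from_opposite(20, 30)   # 20 / tan(30 degrees)
mistaken = opposite_from_adjacent(20, 30)  # 20 * tan(30 degrees)

print(round(correct, 2), round(mistaken, 2))
```

Both functions start from the same relationship, tan(angle) = opposite / adjacent. Which one applies depends entirely on which side the 20 is, and that is the distinction the quiz mistake blurs.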

So I didn’t bother marking their quizzes, at least right away. Instead I made a little example-based activity. I assigned the activity to my students in class the next day.


I’m not saying ‘here’s this great resource that you can use.’ This is an incredibly sloppy version of what I’m trying to describe — count the typos, if you can. And the explanation in my example is kind of…mushy. Could’ve been better.

What excites me is that this activity is replacing what was for me a far worse activity. Handing back these quizzes focuses their attention completely on what they did and what they could have done to get the question right. There’s a time for that too, but this wasn’t a time for tinkering, it was a time for thinking about an important distinction between two different problem types. This activity focused attention (more or less) where it belonged.

So I think, for now, this is what feedback comes down to. Trying to figure out, as specifically as possible, what kids could learn, and then trying to figure out how to help them learn it.

It can be a whole-class activity; it can be an explanation; it can be practice; it can be an example; it can be a new lesson. It doesn’t need to be a comment. It doesn’t need to be personalized for every student. It just needs to do that one thing, the only thing feedback ever can do, which is help kids think about something.

The term ‘feedback’ comes with some unhelpful associations — comments, personalization, a conversation. It’s best, I think, to ignore these associations. Sometimes, it’s helpful to ignore complexity.

Reading Research: What Sort of Teaching Helps Struggling First Graders The Most?

I always get conflicted about reading an isolated study. I know I’m going to read it poorly. There will be lots of terms I don’t know; I won’t get the context of the results. I’m assured of misreading.

On the other side of the ledger, though, is curiosity, and the fun that comes from trying to puzzle these sorts of things out. (The other carrot is insight. You never know when insight will hit.)

So, when I saw Heidi talk about this piece on twitter, I thought it would be fun to give it a closer read. It’s mathematically interesting, and much of it is obscure to me. Turns out that the piece is openly available, so you can play along at home. So, let’s take a closer look.


The stakes of this study are both high and crushingly low. Back in 2014 when this was published, the paper caught some press that picked up on its ‘Math Wars’ angle. For example, you have NPR’s summary of the research:

Math teachers will often try to get creative with their lesson plans if their students are struggling to grasp concepts. But in “Which Instructional Practices Most Help First-Grade Students With and Without Mathematics Difficulties?” the researchers found that plain, old-fashioned practice and drills — directed by the teacher — were far more effective than “creative” methods such as music, math toys and student-directed learning.

Pushes all your teachery buttons, right?

But if the stakes seem high, the paper is also easy to disbelieve, if you don’t like the results.

Evidence about teaching comes in a lot of different forms. Sometimes, it comes from an experiment; y’all (randomly chosen people) try doing this, everyone else do that, and we see what happens. Other times we skip the ‘random’ part and find reasonable groups to compare (a ‘quasi-experiment’). Still other times we don’t try for statistically valid comparisons between groups, and instead a team of researchers will look very, very closely at teaching in a methodologically rich and cautious way.

And sometimes we take a big pile of data and poke at it with a stick. That’s what the authors of this study set out to do.

I don’t mean to be dismissive of the paper. I’m writing about it because I think it’s worth writing about. But I also know that lots of us in education use research as a bludgeon. This leads to educators reading research with two questions in mind: (a) Can I bludgeon someone with this research? (b) How can I avoid getting bludgeoned by this research?

That’s why I’m taking pains to lower the stakes. This paper isn’t a crisis or a boon for anyone. It’s just the story of how a bunch of people analyzed a bunch of interesting data.

Freed of the responsibility of figuring out if this study threatens us or not, let’s muck around and see what we find.


The researchers lead off with a nifty bit of statistical work called factor analysis. It’s an analytical move that, as I read more about, I find both supremely cool and metaphysically questionable.

You might have heard of socioeconomic status. Socioeconomic status is supposed to explain a lot about the world we live in. But what is socioeconomic status?

You can’t directly measure someone’s socioeconomic status. It’s a latent variable, one responsible for a myriad other observable variables, such as parental income, occupational prestige, the number of books lying around your parents’ house, and so on.

None of these observables, on their own, can explain much of the variance in student academic performance. If your parents have a lot of books at home, that’s just it: your parents have a lot of books. That doesn’t make you a measurably better student.

Here’s the way factor analysis works, in short. You get a long list of responses to a number of questions, or a long list of measurements. I don’t know, maybe there are 100 variables you’re looking at. And you wonder (or program a computer to wonder) whether these can be explained by some smaller set of latent variables. You see if some of your 100 variables tend to vary as a group, e.g. when income goes up by a bit, does educational attainment tend to rise too? You do this for all your variables, and hopefully you’re able to identify just a few latent variables that stand behind your big list. This makes the rest of your analysis a lot easier; much better to compare 3 variables than 100.
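To make that a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the data are simulated (six made-up observed variables driven by two hidden ones), and instead of a full factor analysis it uses a simpler stand-in, looking at the eigenvalues of the correlation matrix (closer to principal components) and counting those above 1, a common rule of thumb. It is not the method the paper’s authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # number of simulated respondents

# Two hypothetical latent variables that we never observe directly.
latent_a = rng.normal(size=n)
latent_b = rng.normal(size=n)

def noisy(latent):
    """An observed variable: the latent value plus independent noise."""
    return latent + 0.5 * rng.normal(size=n)

# Six observed variables: three driven by each latent variable.
observed = np.column_stack([
    noisy(latent_a), noisy(latent_a), noisy(latent_a),
    noisy(latent_b), noisy(latent_b), noisy(latent_b),
])

# Variables that "vary as a group" show up as blocks of high correlation.
corr = np.corrcoef(observed, rowvar=False)

# Eigenvalues of the correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Rule of thumb (an assumption here): keep factors with eigenvalue > 1.
n_factors = int((eigenvalues > 1).sum())

print(np.round(eigenvalues, 2))
print(n_factors)
```

In this simulation two eigenvalues stand far above the rest, which is the signal that six measured variables can be summarized by two underlying ones. Real survey data are rarely this tidy.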

That’s what we do for socioeconomic status. That’s also what the authors of this paper do for instructional techniques teachers use with First Graders.

I’m new to all this, so please let me know if I’m messing any of this up, but it sure seems to me tough to figure out what exactly these latent variables are. One possibility is that all the little things that vary together — the parental income, the educational attainment, etc. — all contribute to academic outcomes, but just a little bit. Any one of them would be statistically irrelevant, but together, they have oomph.

This would be fine, I guess, but then why bother grouping them into some other latent variable? Wouldn’t we be better off saying that a bunch of little things can add up to something significant?
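Here is a rough simulation of that first possibility, in Python, with entirely made-up numbers: 25 hypothetical observables that each nudge an outcome slightly, plus a lot of noise. It is only a sketch of the “little things add up” idea, not a claim about how socioeconomic status actually works.

```python
import numpy as np

rng = np.random.default_rng(1)
n_students, n_variables = 5000, 25

# Hypothetical observables, each nudging the outcome only slightly.
small_things = rng.normal(size=(n_students, n_variables))
outcome = small_things.sum(axis=1) + 10 * rng.normal(size=n_students)

# Correlation of each observable, taken alone, with the outcome.
individual = np.array([
    np.corrcoef(small_things[:, j], outcome)[0, 1]
    for j in range(n_variables)
])

# Correlation of the simple sum of all observables with the outcome.
combined = np.corrcoef(small_things.sum(axis=1), outcome)[0, 1]

print(round(float(np.abs(individual).max()), 2), round(float(combined), 2))
```

In this setup each variable on its own correlates only weakly with the outcome, while their plain sum correlates much more strongly. No hidden “other thing” is needed to get that pattern, which is why the choice between the two interpretations is hard to settle from the numbers alone.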

The other possibility is that socioeconomic status is some real, other thing, and all those other measurable variables are just pointing to this big, actual cause of academic success. What this ‘other thing’ actually is, though, remains up in the air.

(In searching for other people who worried about this, I came across a piece from History and Philosophy of Psychology Bulletin called ‘Four Queries About Factor Reality.’ Leading line: ‘When I first learned about factor analysis, there were four methodological questions that troubled me. They still do.’)

So, that’s the first piece of statistical wizardry in this paper. Keep reading: there’s more!


Back to First Graders. The authors of this paper didn’t collect this data; the Department of Education, through the National Center for Education Statistics, ran the survey.

The NCES study was immense. It’s longitudinal, so we’re following the same group of students over many years. I don’t really know the details, but they’re aiming for a nationally representative sample of participants in the study. We’re talking over ten thousand students; their parents; thousands of teachers; they measured kids’ height, for crying out loud. It’s an awe-inspiring dataset, or at least it seems that way to me.

As part of the survey, they ask First Grade teachers to answer questions about their math teaching. First, 19 instructional activities…

Screenshot 2017-05-16 at 8.55.12 PM

…and then, 29 mathematical skills.

Screenshot 2017-05-16 at 8.55.48 PM

Now, we can start seeing the outlines of a research plan. Teachers tell you how they teach; we have info about how well these kids performed in math in Kindergarten and in First Grade; let’s find out how the teaching impacts the learning.

Sounds good, except HOLY COW look at all these variables. 19 instructional techniques and 29 skills. That’s a lot of items.

I think you know what’s coming next…



So we do this factor analysis (beep bop boop boop) and it turns out that, yes, indeed some of the variables vary together, suggesting that there are some latent, unmeasured factors that we can study instead of all 48 of these items.

Some good news: the instructional techniques only got grouped with other instructional techniques, and skills got grouped with skills. (It would be a bit weird if teachers who teach math through music focused more on place value, or something.)

I’m more interested in the instructional factors, so I’ll focus on the way these 19 instructional techniques got analytically grouped:

Screenshot 2017-05-16 at 9.08.53 PM.png

The factor loadings, as far as I understand, can be interpreted as correlation coefficients, i.e. higher means a tighter fit with the latent variable. (I don’t yet understand Cronbach’s Alpha or what it signifies. For me, that’ll have to wait.)
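For anyone who wants something concrete in the meantime, here is a minimal sketch of the standard textbook formula for Cronbach's alpha, which is usually described as a measure of how consistently a set of items move together: alpha = k/(k-1) × (1 − sum of item variances / variance of the totals), where k is the number of items. It says nothing about how the paper's authors computed theirs, and the survey responses below are invented.

```python
def variance(values):
    """Sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def cronbach_alpha(items):
    """items: a list of k lists, one per survey item, one score per respondent."""
    k = len(items)
    # Each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Made-up responses from five hypothetical teachers (1 = never, 5 = daily).
worksheets = [1, 2, 3, 4, 5]
drills = [1, 2, 3, 4, 5]  # moves in lockstep with worksheets

# Made-up responses from four hypothetical teachers, with no relationship.
item_a = [1, 2, 1, 2]
item_b = [1, 1, 2, 2]

alpha_lockstep = cronbach_alpha([worksheets, drills])  # items agree perfectly
alpha_unrelated = cronbach_alpha([item_a, item_b])     # items don't agree at all
print(alpha_lockstep, alpha_unrelated)
```

In these two extreme cases, items that agree perfectly give an alpha of 1 and items with no relationship give an alpha of 0. Real survey items land somewhere in between.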

Some of these loadings seem pretty impressive. If a teacher says they frequently give worksheets, yeah, it sure seems like they also talk about frequently running routine drills. Ditto with ‘movement to learn math’ and ‘music to learn math.’

But here’s something I find interesting about all this. The factor analysis tells you what responses to this survey tended to vary together, and it helps you identify four groups of covarying instructional techniques. But — and this is the part I find so important — the RESEARCHERS DECIDE WHAT TO CALL THEM.

The first group of instructional techniques all focus on practicing solving problems: students practice on worksheets, or from textbooks, or drill, or do math on a chalkboard. The researchers name this latent variable ‘teacher-directed instruction.’

The second group of covarying techniques are: mixed ability group work, work on a problem with several solutions, solving a real life math problem, explaining stuff, and running peer tutoring activities. The researchers name this latent variable ‘student-centered instruction.’

I want to ask the same questions that I asked about socioeconomic status above. What is student-centered instruction? Is it just a little bit of group work, a little bit of real life math and peer tutoring, all mushed up and bundled together for convenience’s sake? Or is it some other thing, some style of instruction that these measurable variables are pointing us towards?

The researchers take pains to argue that it’s the latter. Student-centered activities, they say, ‘provide students with opportunities to be actively involved in the process of generating mathematical knowledge.’ That’s what they’re identifying with all these measurable things.

I’m unconvinced, though. We’re supposed to believe that these six techniques, though they vary together, are really a coherent style of teaching, in disguise. But there seems to me a gap between the techniques that teachers reported on and the style of teaching they describe as ‘student-centered.’ How do we know that these markers are indicators of that style?

Which leads me to think that they’re just six techniques that teachers often happen to use together. They go together, but I’m not sure the techniques stand for much more than what they are.

Eventually — I promise, we’re getting there — the researchers are going to find that teachers who emphasize the first set of activities help their weakest students more than teachers emphasizing the second set. And, eventually, NPR is going to pick up this study and run with it.

If the researchers decide to call the first group ‘individual math practice’ and the second ‘group work and problem solving’ then the headline news is “WEAKEST STUDENTS BENEFIT FROM INDIVIDUAL PRACTICE.” Instead, the researchers went for ‘teacher-directed’ and ‘student-centered’ and the headlines were “TEACHERS CODDLING CHILDREN; RUINING FUTURE.”

I’m not saying it’s the wrong choice. I’m saying it’s a choice.


Let’s skip to the end. Teacher-directed activities helped the weakest math students (MD = math difficulties) more than student-centered activities.

Screenshot 2017-05-16 at 9.39.26 PM.png

The researchers note that the effect sizes are small. Actually, they seem a bit embarrassed by this and argue that their results are conservative, and the real gains of teacher-directed instruction might be higher. Whatever. (Freddie deBoer reminds us that effect sizes in education tend to be modest, anyway. We can do less than we think we can.)

Also ineffective for learning to solve math problems: movement and music, calculating the answers instead of figuring them out, and ‘manipulatives.’ (The researchers call all of these ‘student-centered.’)

There’s one bit of cheating in the discussion, I think. The researchers found another interesting thing from the teacher survey data. When a teacher has a lot of students with math difficulty in a class, they are more likely to do activities involving calculators and with movement/music than they otherwise might be:

Screenshot 2017-05-16 at 9.48.00 PM

You might recall that these activities aren’t particularly effective math practice, and so they don’t lead to kids getting much better at solving problems.

By the time you get to the discussion of the results, though, here’s what they’re calling this: “the increasing reliance on non-teacher-directed instruction by first grade teachers when their classes include higher percentages of students with MD.”

Naming, man.

This got picked up by headlines, but I think the thing to check out is that the ‘student-centered’ category did not correlate with the percentage of struggling math students in a class. That doesn’t sound to me like non-teacher-directed techniques get relied on when teachers have more weak math students in their classes.

The headline news for this study was “TEACHERS RELY ON INEFFECTIVE METHODS WHEN THE GOING GETS ROUGH.” But the headline probably should have been “KIDS DON’T LEARN TO ADD FROM USING CALCULATORS OR SINGING.”


Otherwise, though, I believe the results of this study pretty unambiguously.

Some people on Twitter worried about using a test with young children, but that doesn’t bother me so much. There are a lot of things that a well-designed test can’t measure that I care about, but it certainly measures some of the things I care about.

Big studies like this are not going to be subtle. You’re not going to get a picture into the most effective classrooms for struggling students. You’re not going to get details about what, precisely, it is that is ineffective about ineffective teaching. We’re not going to get nuance.

Then again, it’s not like education is a particularly nuanced place. There are plenty of people out there who take the stage to provide ridiculously simple slogans, and I think it’s helpful to take the slogans at their word.

Meaning: to the extent that your slogan is ‘fewer worksheets, more group work!’, that slogan is not supported by this evidence. Ditto with ‘less drill, more real life math!’

(I don’t have links to people providing these slogans, but that’s partly because scrolling through conference hashtags gives me indigestion.)

And, look, is it really so shocking that students with math difficulties benefit from classes that include proportionally more individual math practice?

No, or at least based on my experience it shouldn’t be. But the thing that the headlines get wrong is that this sort of teaching is anything simple. It’s hard to find the right sort of practice for students. It’s also hard to find classroom structures that give strong and struggling students valuable practice to work on at the same time. It’s hard to vary practice formats, hard to keep it interesting. Hard to make sure kids are making progress during practice. All of this is craft.

My takeaway from this study is that struggling students need more time to practice their skills. If you had to blindly choose a classroom that emphasized practice or real-life math for such a student, you might want to choose practice.

But I know from classroom teaching that there’s nothing simple about helping kids practice. It takes creativity, listening, and a lot of careful planning. Once we get past some of the idealistic sloganeering, I’m pretty sure most of us know this. So let’s talk about that: the ways we help kids practice their skills in ways that keep everybody in the room thinking, engaged, and that don’t make children feel stupid or that math hates them.

But as long as we trash-talk teacher-directed work and practice, I think we’ll need pieces like this as a correction.

Geometry Labs + Which One Doesn’t Belong

I love Henri Picciotto’s Geometry Labs text. I was preparing my geometry class for his inscribed angles activity, and saw this:


Thanks to the Which One Doesn’t Belong people (and Christopher’s lovely book), I’m no longer able to look at sets of four things. It’s ruined me. I’m always deciding which of them is the odd one out.

Since there are subtle differences between the inscribed angle cases, I decided to cover up the words and ask my students which of the four diagrams was the weird one.

image (10).jpg

This drew attention to the location of the centers, the location of radii, and the presence of isosceles/scalene triangles. (I know it’s May, but any chance to get kids to practice using this vocabulary is time well spent.)

This week in 4th Grade I’ve also been using Geometry Labs’s chapter on tilings. (Sort of a random topic, but random topics are fun. Plus, I need to figure out where we stand on multiplication/division before one last push in our last weeks together.)

There I was, trying to figure out how to attune kids to the subtle classification differences between these two square tilings…


…and while, admittedly, I clearly had “Which One Doesn’t Belong” on my mind, it seemed a pretty good fit for my need here too. I took out some pattern blocks and snapped a picture:

image (9).jpg

There were lots of interesting aspects of this discussion, though my favorite had to do with whether the top-left and bottom-right tilings were different. I forget if we’ve talked about congruence yet in this class, but there were a lot of good ideas about whether tilting the tessellation made it a different tiling.

Not much else to share here, but I guess I’d say that I do this a lot. I don’t rewrite texts or worksheets or whatever very often. More often I add little activities before or after, to make sure kids can understand the activity, or to react to their thinking. That’s good for me (because I don’t have time to remake everything) and good for kids too (I write crappy curriculum).

What is it that I do?

I read a lot of teacher blogs these days.

(Incidentally, I turned MTBoS_Blogbot into an RSS feed, which was my reason for begging Lusto to make it in the first place.)

Anyway, I read a lot of teacher blogs. I see your beautiful activities, clever games and meaningful conversations. I wish I had an ounce of the teacherly creativity that Sarah Carter has, but really I don’t. It’s not what I do.

So, what exactly is it that I do?

In 8th Grade we’re going to study exponential functions. Class began with a lovely Desmos activity. They worked with randomly assigned partners.

Screenshot 2017-05-01 at 6.58.28 PM.png

After thinking through these questions, I thought kids could begin learning about equations for exponential functions, and towards this it would be helpful to contrast linear tables/equations with exponential ones.

In years past, I would have aimed to elicit these ideas out of a conversation. I’ve lost faith in this move, though. While it’s nice to get kids to share ideas, their explanations are often muddy and don’t do much for kids who don’t already see the point. (Just because a kid can say something doesn’t mean that they should.) This, at least, is what I suspect.

Better, I’ve come to believe, to follow up an activity like this one by briefly and directly presenting students with the new idea. I worry more about visual presentation than I used to. Here is what I planned to write on the board, from my planning notebook:


I put this on the board, so that it would be ready after the kids finished the Desmos activity: what could the equations of each of these relationships be? boom, here they are:

image (1).jpg
Spot the differences between this and my plan! They were all on purpose.

During planning I hadn’t fully thought through what I was going to ask kids to do with this visual. At first, I stumbled. I gave an explanation along with the visual, but I got vibes that kids weren’t really thinking carefully about the equations yet. So I asked them to talk to their partners for a minute to make sure they both understood where the exponential equation came from.

You can tell when a question like that falls flat. There wasn’t that pleasant hum of hard-thinking in the classroom, and the conversations I overheard were superficial.

Remembering the way Algebra by Example (via CLT) uses example/problem pairs, I quickly put a new question on the board. I posted an exponentially growing table and asked students to find an equation that could fit this relationship.

There we were! This question got that nice hum of thinking going.

The equation wasn’t there, originally, duh.

While eavesdropping on kids, I heard that L had a correct equation. I thought it would be good to ask L to present her response, as she isn’t one of the “regular customers.”

Her explanation, I thought, gave a great glimpse of how learning works. She shared her equation but immediately doubted it — she wasn’t sure if it worked for (0,5). After some encouragement from classmates she realized that it would work. Turns out that her thought process went like this: 10, 100, 1000, that’s powers of 10 and this looks a lot like that. But how can I get those 5s to show up in there too…ah! The example involved multiplication so this one can too.

(Of course, she didn’t say this in so many words. After class I complimented her on the explanation and she put herself down: I don’t know how to explain things. I told her that learning new stuff is like that — your mind outpaces your mouth — but I thought I had understood her, and confirmed that I got her process.)

With the example properly studied, I went on to another activity. Following my text, the next twist was to bring up compound interest. I worried, though, that my students would hardly understand the compound interest scenario well enough to learn something from attacking a particular problem.

While thinking about this during planning, I thought about Brian’s numberless word problems. (My understanding of numberless problems is, in turn, influenced by my understanding of goal-free problems in CLT.)

I took the example problem from my text ($600 investment, 7% interest/year, how much money do you have in 10 years?), erased the numbers and put the variables on the board.


Then, I asked kids (again with the partners) to come up with some numbers, and a question. If you come up with a question, try to answer it. (A kid asking But I can’t think of a question is why this activity was worth it. And with some more thought, they could.)

I collected their work from this numberless interest problem, and I have it in front of me now. I see some interesting things I didn’t catch during class. Like the kid who asked ‘How much $ does someone lose from interest after 5 years?’ (And why would an 8th Grader know what interest is, anyway?) Or the kids who thought a 10% interest rate would take $100 to $180 over 8 years.

No indications from this work that anyone uses multiplication by 1.10 or 1.08 or whatever to find interest. Not surprising, but I had forgotten that this would be a big deal for this group.
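Here is a minimal sketch of the arithmetic at stake, assuming interest compounds once a year (my text's problem doesn't obviously say otherwise, but that's my assumption). One reading of the $100-to-$180 answer is that those kids added 10% of the original $100 each year, which is simple interest; multiplying by 1.10 each year gives something bigger. The function names are mine, just for illustration.

```python
def simple_interest(principal, rate, years):
    """Add the same rate * principal every year."""
    return principal * (1 + rate * years)

def compound_interest(principal, rate, years):
    """Multiply by (1 + rate) every year, assuming annual compounding."""
    return principal * (1 + rate) ** years

# The kids' numbers: $100 at 10% for 8 years.
kids_guess = simple_interest(100, 0.10, 8)    # 100 + 8 * 10
compounded = compound_interest(100, 0.10, 8)  # 100 * 1.10^8

# The textbook problem: $600 at 7% for 10 years.
textbook = compound_interest(600, 0.07, 10)   # 600 * 1.07^10

print(round(kids_guess, 2), round(compounded, 2), round(textbook, 2))
# prints 180.0 214.36 1180.29
```

The gap between 180 and about 214 is the whole difference between a linear and an exponential relationship, which is what the contrast on the board was meant to set up.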

For a moment I’m tempted to give my class feedback on their work…but then I remember that I can also just design a short whole-group learning activity instead, so why bother with the written feedback at all.

I’m not exactly sure what ideas in the student work would be good to pick up on. I should probably advance their ability to use decimals to talk about percent increase, but then again there was also that kid who wasn’t sure what interest was.

My mind goes to mental math. I could create a string of problems that use the new, exponential structure with decimals:

  • 600 x 1.5
  • 600 x 1.5 x 1.5
  • 1000 x 1.5^3
  • 50% interest on a $200 investment

That’s awfully sloppy, but it’s just a first draft.

Or maybe the way to go is a Connecting Representations activity that asks kids to match exponential expressions with interest word problems.

I’m not sure, but all this is definitely a good example of what I do. It’s what I’m learning how to do better in teaching, at the moment. It’s not fancy or flashy, and no one’s lining up to give me 20k for it, but it’s definitely representative of where I am now.

I’m not sure at all how to generalize or describe what it is this is an example of, though. Is it the particular flow of the 45-minute session that I’m learning to manage? Or is it the particular activity structures that I happen to have gathered in my repertoire?

None of those are satisfying answers. Maybe, instead, this is just an example of me basically doing what I think I should be doing. My reading is piling up, and I’m getting some coherent ideas about how learning and teaching can work. This lesson is a good example of how those principles more-or-less look in action. It might not be right (and it sure isn’t at the upper limits of what math class can be) but I’ve got a decent reason for most of the decisions I made in this session.

I think what I have to share, then, is how what I’m reading connects to how I’m teaching. This episode is an example of that.

Michael Disagrees With Tweets

In what may or may not begin a new series on this blog, I will now (politely and lovingly, I hope) disagree with a tweet.

On the internet, nobody knows if you can manage a classroom or not. Maybe twitter can solve this. Currently, you get a blue “verified account” check next to your name if you did something cool to deserve it, like being rich or popular. Maybe we could have something like that in education. (I’m a verified red apple educator!)

Until then, there’s no way to tell online who can or can’t run a classroom.

I suppose it’s true that someone who has never run a classroom probably can’t, and these people shouldn’t try to tell you about managing behavior. But take Tom. I don’t know Tom. I have no idea what sort of a teacher he was when he was in the classroom. How would Tom’s standard apply to Tom? How can I know if Tom can run a classroom or not?

This is always how it is with teaching. We don’t have access to each other’s classrooms, so we can only rely on each other’s descriptions of teaching. That’s true for everybody, teachers and non-teachers alike.

This matters a lot more to ex-teachers than to teachers, I think. The relationship between teachers and non-teachers is complicated. You might think that teachers are just suspicious of non-teachers, and that’s true, but we also care the most about what some non-teachers say. Someone on twitter once pointed out to me that classroom teachers are generally suspicious of non-teachers but very trusting of a few chosen non-teacher experts who have credibility. This struck me as totally true.

As a consequence of all this, some non-teachers find it helpful to try to hold on to the status of in-the-know teacher even though they have left the classroom.

To which I say, it’s not worth it. Don’t bother. The kindness that teachers offer other teachers isn’t because of a presumption that this other teacher gets it, or that they have useful information to offer that non-teachers don’t. Rather, I’d say, it’s just that: kindness. I would posit that it’s not that teachers are more trusting of others in the classroom, just that we try to be nice to each other, because the job is hard and knowledge is tentative and we all know how little status we each have. Once you leave the classroom your status has just bumped up in the education world, and that extra-kindness can no longer protect you from the skepticism of other teachers.

Which is fine, because you can still influence teachers in the one way you ever could: by describing what it is that you think will work.

Mental Math Gone Wrong?

Maybe this was a good idea, maybe not.

I was trying to figure out how to start class. My 8th Graders have been studying the Pythagorean Theorem. I knew I wanted to start with some mental math* but wasn’t sure how to start.

This desire to often begin class with some mental math is, at this point, sort of an instinct. On the one hand you need instincts when you’re planning class, because otherwise everything takes forever as you get sucked into a recursive vortex of decision-making. But is it a good instinct? I don’t know how to think about that.

The way I teach the Pythagorean Theorem, being able to mentally chunk a tilted square into triangles and squares (rather than trying to count each square or triangle) is an important part of the skill. It helps kids quickly see the area of squares, freeing up their attention to focus on the relationship between the squares built on the sides of triangles.

Yesterday, we explicitly talked about the Pythagorean Theorem in terms of the area of squares built on a right triangle’s sides. The plan for class was for kids to get better at using it in all sorts of different problems.

So, I decided to build a string of squares built on the hypotenuses of right triangles, and ask kids to find the square-areas in sequence, building up to a generalization. We start: What’s this square’s area? Put a thumb up (please don’t wave a hand in someone else’s face) when you’ve decided. What is the area? How do you know? OK here’s your next tilted square, etc.

image (6).jpg
Sloppy picture so you know it’s the real deal

Here’s where my teaching got sort of mushy. The really important skill isn’t finding the area of tilted squares. What kids really are going to want to know, later on, is the Pythagorean relationship between right triangle sides and areas.

So here’s the question: did this string of problems draw attention to the important math?

Turns out, it didn’t. Kids made the generalization in the last step (as far as I could tell from eavesdropping on their conversations) entirely on the basis of the earlier examples. And those areas were found by chunking up the area. In other words, this was arithmetic-generalization. They didn’t use the Pythagorean relationship.
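To spell out the two routes to the same number, here is a minimal sketch. It assumes one particular way of chunking (four right triangles around a small inner square), which may not be how any given kid saw it, and the function names are just for illustration. For a right triangle with legs a and b, the tilted square on the hypotenuse splits into four copies of the triangle plus a square of side |b − a|.

```python
def tilted_square_area_by_chunking(a, b):
    """Area of the square on the hypotenuse, found by dissection:
    four right triangles (legs a and b) plus an inner square of side |b - a|."""
    four_triangles = 4 * (a * b / 2)
    inner_square = (b - a) ** 2
    return four_triangles + inner_square

def tilted_square_area_by_pythagoras(a, b):
    """Same area, using the relationship between the squares on the sides."""
    return a ** 2 + b ** 2

for a, b in [(1, 2), (2, 3), (3, 4), (5, 12)]:
    chunked = tilted_square_area_by_chunking(a, b)
    pythagorean = tilted_square_area_by_pythagoras(a, b)
    print(a, b, chunked, pythagorean)
```

Both columns agree every time, since 2ab + (b − a)² expands to a² + b². That is exactly why a kid can get every answer in the string right without ever touching the Pythagorean relationship.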

What were my options, when I realized this? I was happy that kids were able to mentally dissect these tilted-squares, but was a bit disappointed that they didn’t start noticing Pythagoras here. I lost a chance to help them try out using that relationship. Since the rest of the class was designed to help them practice this theorem, it became important for me to prompt their memory of it at the start.

What can you do, right? Impossible to predict kids perfectly. Except that I could have prompted the Pythagorean relationship after the first example didn’t go the way I expected it to. I could have said (after I made sure that students were not going to) that this tilted square’s area could be found using Pythagoras, and then I’m sure I would have gotten more kids to play out this relationship in their minds for the rest of the string.

That’s not what happened, though, so I weakly finished the string with my own personal observation that, hey, we could’ve used PT here. The kids shrugged. OK. I pulled out a quick problem that did prompt kids to use the Pythagorean Theorem, but by then I’m not sure I had everybody on board. We finish, and kids are getting jittery. We’ve used up* whatever whole-group learning time we were going to get at the start of class, so I started problem-solving time.

That’s definitely how I see things right now, at least. Again, I don’t know if this instinct is a good one.

Class went OK after that. But I’m still trying to figure out whether I did this right. Should I have designed the initial string differently? Should I have reacted differently?

(And, the sort of meta-question I have is what exactly it would mean for me — or any teacher — to know how to do this better. Where does that knowledge come from? Can it be shared?)

Two Ideas

Everybody should do more mental math — mental algebra, geometry, calculus, topology, whatever — at pretty much every level of math.

Whenever you’re tempted to write comments on a student’s paper, just pick out one common issue from the class’ work and start class responding to that, somehow.



Stephen King taught high school English for two years:

I wasn’t having much success with my own writing, either. Horror, science fiction, and crime stories in the men’s magazines were being replaced by increasingly graphic tales of sex. That was part of the trouble, but not all of it. The bigger deal was that, for the first time in my life, writing was hard. The problem was the teaching. I liked my coworkers and loved the kids — even the Beavis and Butt-Head types in Living with English could be interesting — but by most Friday afternoons I felt as if I’d spent the week with jumper cables clamped to my brain. If I ever came close to despairing about my future as a writer, it was then.

“Jumper cables clamped to my brain.” I totally believe and experience this. It’s an obvious fact of my life…but is it true? Why would it be?

My work isn’t as intellectually involved as e.g. being a grad student, researcher, journalist, etc. Teachers don’t have to regularly learn new facts or disciplines. We don’t make our living as readers, writers or thinkers.

We don’t even work especially long hours. Yes, yes, endless grading. But even taking grading into account, it’s unclear to me how many extra hours we actually put in. I know, personally, that I tend to way over-estimate my out-of-work hours. I tend to count all sorts of quasi-work into the bucket, like all that time that I’m thinking about looking at student work but instead I’m writing a blog post on a Sunday night.

Grading and planning are like (to get back to King) little evil vampire children that rap on the window while we’re catching a break after a long day. Let us in, they say, you need to.

(I just finished reading Salem’s Lot which stars Matt Burke, veteran teacher, which is how I ended up down this road.)

The Bureau of Labor Statistics surveyed teachers and instead of asking how many hours they worked in a week, asked them how many they worked yesterday. (This includes out of school work.) The stunning results: responses amount to just under a forty-hour work week. We even, on average in this survey, work less on the weekends than a comparison group that contains health care professionals, business and financial operations professionals, architects and engineers, community and social services professionals, and managers.

I’m inclined to believe the more modest hour-estimates of the BLS, as they fit what I see in myself and colleagues at the different places I’ve taught. (I’d also say that, in the places I’ve taught, there are outlier teachers who just go nuts with work. If the BLS stuff doesn’t fit your picture, you might be such a teacher.)

But I’m also inclined to think that King’s brain wouldn’t be depleted if he were a journalist or a researcher or a bond-trader who worked till ten every night.

(Speaking of work hours and exhaustion: I read King’s Under the Dome while working fifteen-hour work days as a delivery truck driver in my summer after graduation. I’ve never been as desperate for a book as I was while working that job. That book took over my life while I was working — my wife [then girlfriend] still teases me about it. The job involved driving around campus, picking up and dropping off recycled furniture, which sort of wore me out. I was physically exhausted but mentally starving and I’d collapse in bed with that book for some of the most satisfying hours of reading I’ve experienced in my life.)

What could there be about teaching that makes it mentally exhausting? Or is this just standard working-adult exhaustion?

I can’t think of anything, which makes me wonder if Steve and I are making this up.

One thing I know about teaching, though, is that you rarely know if you’ve taught well. And that fits with what I know about writing: that it’s lonely work, done in a quiet space over long periods of time. Unlike teaching, there are certain stone-cold ways of knowing that you’ve done good work, like the acclaim of readers, but the lead-up to that moment (if it ever arrives) is the ultimate marshmallow test.

Maybe this is it: teaching exhausts us in a way that kills our willingness to write.