The curse of conceptual understanding
A response to Bill McCallum
My recent, provocatively titled post, Conceptual understanding is a myth, has drawn the attention of Bill McCallum, a mathematician and one of the two founders of the Illustrative Mathematics curriculum that is popular in the United States. He has written a post on his Substack where he takes issue with mine.
I have been aware of Illustrative Mathematics for some time and the approach to teaching that it represents:
“Students discover, understand, and internalize key math concepts and apply their learning to various real-world problems and scenarios, simultaneously building procedural fluency and conceptual understanding.”
If this description is accurate, then Illustrative Mathematics is a flawed approach that does not align with what we now know about how people learn academic content. Still, I can find much common ground with McCallum’s arguments and he has sought to find common ground with mine. The problem is that this does not lead in the direction McCallum assumes.
Automated schemas
To begin, consider the following equation that I discussed in my PhD research. If you have seen me give a presentation, you will be familiar with it:
3x = 18
Consider what a person needs to know to be able to solve this equation. Firstly, they must know what the number symbols represent. At an early age, they might start by counting tokens and associating three tokens with the word, ‘three’. Then, when they begin to read, they would learn to map the sound of the word ‘three’ onto the symbol ‘3’. None of this is trivial and if you imagine it is, then you should go spend some time in an early years setting.
Even if you are unconvinced by cognitive load theory, the finding that we can only process about four items at a time in working memory is robust across cognitive science. But what is an item? In the equation above, we can see five symbols. If that surprises you because you read ‘18’ as one item then, again, you should visit an elementary school and see how painstakingly such knowledge is assembled.
In addition to numbers, a student solving this equation would need to know what the equality sign means and to infer the presence of a missing multiplication sign between the 3 and the x. Worse still, the items all interact with each other, so that whatever we do to one of them, it has implications for the others. This means there are more than six items present, with the total depending on how you count these relationships. This is well outside the working memory capacity of four items.
And yet, for most people reading this post, within a fraction of a second of seeing the above equation, they will simply have known that x is equal to six. They will have achieved this not by using working memory resources, but by activating a complex schema for problems like these that is stored in long-term memory. These schemas are so powerful, they can answer questions before we think to ask them. Yet they represent the slow accumulation of large amounts of integrated knowledge. In this case, for example, an algebra schema must seamlessly integrate with the schema for the six-times-table. Note that you could not achieve this feat without automatically being able to retrieve 3 × 6 = 18. Those who suggest it is a waste of time to memorise such maths facts are plainly wrong.
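As a rough sketch, here are the steps that such a schema compresses into a single glance, written out in full. I am taking the equation to be 3x = 18, which is consistent with the retrieved fact 3 × 6 = 18. The intermediate lines and their annotations are my own illustration, not a claim about how any particular student works.

```latex
\begin{align*}
3x &= 18 && \text{read as } 3 \times x = 18 \\
\frac{3x}{3} &= \frac{18}{3} && \text{do the same thing to both sides} \\
x &= 6 && \text{because } 3 \times 6 = 18
\end{align*}
```

Each line depends on knowledge that has to be retrieved rather than worked out: the implied multiplication, the meaning of the equality sign, and the times-table fact.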
What is understanding?
The sheer ease with which individuals who have studied algebra can retrieve this answer leads to a common problem—the curse of knowledge. We struggle to empathise with those who lack these schemas because it seems so obvious and trivial to us. And that sense of triviality leads to us dismissing the hard work that goes into developing these schemas and causes us to prioritise something more glamorous and grandiose. This is the point where eminent mathematics professors scoff at the idea of children completing exercises with a pencil and paper and instead, wave their hands and argue that real mathematicians look for patterns, make conjectures, test cases, try failed approaches, justify arguments, and communicate ideas to others.
You can imagine the mathematical equivalent of Robin Williams’ character in Dead Poets Society urging his young charges to tear pages out of their textbooks and instead go off into the woods and invent new Noncommutative Geometry together.
The ease with which relative experts draw on schemas to solve problems, schemas that are far more complete than the developing schemas of novices, is probably what gives us the subjective experience that we call ‘understanding’. There are no confusing bits to stumble over. We can take in the problem in its entirety. These schemas do not simply contain procedures. It is important to stress that a good teacher will explain and re-explain why different methods work and what is going on under the hood. These explanations add to the developing schemas.
A good example of this knowledge—and one that has been heavily researched—is the idea that the equality sign means ‘the same on both sides’ and not ‘put the answer here’. However, this is relatively straightforward to learn and recite in comparison to learning how to apply this knowledge to solve linear equations. And it is not qualitatively different from other forms of knowledge. It is simply knowledge of a different and simpler kind.
Converging on transfer
In Conceptual understanding is a myth, I show three examples of the kinds of questions researchers use to test for conceptual knowledge. These are taken from a paper by Michael D’Erchie and colleagues. Again, it is worth noting that researchers do not call it ‘understanding’ and that is a sign there is something troublesome about the concept. I also refer to the Crooks and Alibali (2014) paper that surveys many such attempts to directly measure conceptual knowledge and demonstrates that the D’Erchie examples are fairly typical.
McCallum calls the D’Erchie examples ‘embarrassing’ and states that I am right to ‘pillory’ them. I don’t agree that I pilloried them; I merely pointed out that they don’t assess conceptual knowledge as claimed. Still, I am glad McCallum and I can agree on this latter point.
The inability to directly measure conceptual understanding is a significant point for McCallum to concede. So-called ‘productive struggle’ has been given a free ride in the media and is a field of research that is drawn upon to justify the adoption of programs like Illustrative Mathematics. When The New York Times published a starry-eyed article on mathematical struggle by Jenny Anderson in 2022, she wrote of the work of Manu Kapur who researches this field. My own research challenges Kapur’s findings, and so I offered to respond—an offer that was declined.
The startling fact is that many productive struggle experiments are poorly controlled, with the supposed explicit teaching comparison conditions often eccentric in design. Those studies that claim to show an advantage for productive struggle in developing conceptual knowledge measure it directly, using the same kinds of questions that Crooks and Alibali charted. If we are going to throw out these measures then we must logically throw out these claims. That’s a big call from McCallum.
And yet McCallum and I both converge on transfer as a better measure, but of what?
What is in a name?
McCallum is keen to suggest transfer measures conceptual understanding. The circle is squared. I am tempted by this. It is a better definition. It is not straightforward to measure transfer, because we need to know whether study participants have encountered ideas before, but it is a better measure than questions that directly attempt to measure conceptual knowledge. That’s why I measured transfer in my PhD research.
But at this point I pause. Occam’s razor suggests that if we can explain transfer with knowledge—as I believe we can—then we should avoid invoking additional ideas such as conceptual understanding.
And transfer is notoriously difficult to achieve. Suffering from the curse of knowledge, teachers underestimate how hard it is for relative novices to apply their new learning to even slightly different situations and so we are tempted to cycle through these new contexts too quickly, as kids—always the most disadvantaged ones—fall off the back of the bus.
Defined by transfer, ‘conceptual understanding’ would be hard and slow to accrete. If everyone recognised this then it might not be a problem.
The problem is the fact that conceptual understanding—conceptual anything—is such a value-loaded term. People assume it is better, superior, higher, groovier than supposedly rote memorised and regurgitated procedural knowledge—boo, hiss! If we declare that transfer is evidence of conceptual understanding then we will initiate a transfer gold rush. Students will be asked to perform transfer. Educators will skip the difficult, arduous and complex task of slowly building suitably structured knowledge—that’s just rote!—and will, instead, prematurely ask students to solve problems of a kind they haven’t seen before. The pursuit of conceptual understanding will encourage poor, ineffective teaching.
Although, to be fair, there is plenty of that around already.




Hi Greg. Thanks for the post, really enjoyed reading.
I wanted to point out that I feel that most of this discussion and research uses secondary examples, and I'm seeing an impact in the primary space. At my last school the learning specialist insisted worked examples go up in every classroom, but everything he showed us referenced upper primary or secondary. The Prep and Year 1 teachers felt really silly doing this. This piece, which I love, uses algebra too. I think these insights would pay off enormously if we could reference the early years more.
In the early years, schemas are less developed and less interconnected, and children have far less relevant knowledge to pull into working memory (I think?). So I feel the disconnect is bigger, because designers have very little reference for what counts as an element and how much interactivity a skill involves, and they consistently put too much into a learning experience. So the overload starts on day 1 and, because everything compounds, a small thing done right at the start is worth far more than excellent instruction later. The primary offerings and influence don't seem to match this disparity.
Fractions are a good example. Referencing the denominator as a unit (unitising) is its own skill, and I've never once seen a school write it as a learning intention in isolation. I run an assessment where I show a circle split into quarters with one shaded and ask what fraction it's divided into. It's rare for more than a few to say quarters. Most lack unitising in this specific context in their prior knowledge: they can name a part of a whole, but only in that one context. And then we wonder why 2 thirds and 2 thirds turn into 4 sixths.
That's one skill. There are so many that are being consistently overlooked and are invisible to designers. And then we spend years working out how to retro-fix everything.
Hi Greg, we agree on more than your article suggests. Once you grant that schemas carry the "why" and not just procedures, and that transfer is the better measure, I'm not sure our remaining disagreement is really about whether the thing exists. And although your subtitle is "A response to Bill McCallum," many of the things you respond to are things other people have said that I disagree with, e.g. that conceptual is superior to procedural, or that kids don't need to learn their math facts. I really appreciated your careful analysis of 3x = 18.
On your Occam's razor point: calling knowledge conceptual or procedural doesn't multiply entities—it classifies one. Mathematics has concepts and procedures, so there is conceptual knowledge and procedural knowledge.
I agree that Conceptual Understanding™ the brand is oversold, and your worry about a gold rush if it is attached to transfer is warranted. I am more interested in conceptual understanding the thing, and I'd happily call it whatever you like if everyone would agree to use the same word, but they won't, so refining the language we have is the more realistic task. And there is a thing to name: what lets knowledge transfer is the way it's organized—connected, built around the main ideas—and that structure is what I mean by the term.
By the way, if the hand-waving professor sending children into the woods is meant for me, I'd point instead to the post I linked to in my article, "Max discovers a theorem," (https://mathematicalmusings.substack.com/p/max-discovers-a-theorem) about a fourth grader who states a general theorem about subtraction in the middle of an ordinary arithmetic lesson. Looking for patterns and making conjectures isn't a glamorous alternative to pencil-and-paper arithmetic—it lives inside it.