Is cognitive load theory pitifully wrong?
Philosophy, neuroscience, non sequiturs and straw men
I receive a notification from ResearchGate whenever a paper cites an article in which I am listed as one of the authors. This is always good news because it’s good to be cited, even if those articles are critiques. Nevertheless, I don’t think I have ever read a critique as strongly worded as a new paper by Minkang Kim, Christopher Duncan, Stanley Yip and Derek Sankey for the journal, Educational Philosophy and Theory. Carl Hendrick has already written about it on Twitter/X but I decided to add my thoughts.
The authors take aim at cognitive load theory and the paper is more entertaining than usual because it adopts an intemperate and polemical approach that would not usually survive peer review. For example, after attempting to summarise the main positions of cognitive load theory, the authors call them, ‘a truly bewildering set of claims that, given a moment’s thought, are educationally, philosophically, and neurobiologically questionable.’ The theory’s support for explicit teaching is ‘pitifully wrong’.
They are not fans.
The authors also darkly hint at a political agenda. They suggest there may be, ‘political factors at play,’ in the way cognitive load theory characterises learning and memory. Unless you are an ideologue who sees politics in everything, that seems like a bizarre thing to propose. They worry about the ‘pervasive endorsement’ of the theory from those such as the Australian Education Research Organisation (AERO) and Ross Fox, the director of Catholic Education Canberra and Goulburn. Cognitive load theory has informed Fox’s Catalyst program, the roll-out of which has coincided with Catholic schools this year being 13 of the top 20 best performing schools in the ACT according to standardised assessments, despite representing only 20% of schools in that territory. That sounds like a good thing but ‘pervasive’ sounds bad.
Frustratingly, Kim et al. argue against the mind being like a computer:
“In opposition to much cognitive psychology that views the brain as a computer, the human brain ‘does not solve problem in the way a computer does’, rather ‘it creates, it imagines’ (Mithen, 1996, p. 34). Steven Mithen (1996) also refers to what he calls ‘the human passion for analogy and metaphor’ (p. 53) distinctive of the modern human thinking, creative, learning brain.”
It is true that some cognitive psychology theories have drawn inspiration from computer analogies. For instance, Baddeley and Hitch’s model of working memory draws parallels between a proposed ‘central executive’ and the central processor of a computer. The problem for Kim et al. is that their paper is a criticism of cognitive load theory and cognitive load theory explicitly rejects this analogy. This is how I know the authors have limited understanding of the theory. They clearly have not read any of my output on it because I feel like I have made this point about computers many times.
The problem with the computer analogy is that it assumes that the central executive controls attention and directs working memory resources. This raises the question of what controls the central executive—in a computer it is a program written by a programmer. Is there a central executive within the central executive and so on to infinity? Cognitive load theory suggests schemas in long-term memory direct attention resources by interacting with stimuli from the environment. For example, if I say, ‘Don’t think of a white bear,’ you will probably think of a polar bear. Why? Because the stimulus from the environment—the words you read—triggered the white bear schema in your long-term memory. A key part of the theory is the effortlessness with which entire schemas full of complicated ideas such as algebra can be activated and deployed in this way, circumventing the limitations of working memory.
This shows that cognitive load theory sees the long-term memory as more dynamic than Kim et al. suppose. For example, the authors think they are arguing against cognitive load theory by stating:
“…while recognising the importance of long-term memory for exams, as teachers we don’t want our students to simply regurgitate what they have learnt, we want them to be active, imaginative, and creative learners and that means, contrary to [cognitive load theory], that working memory, properly conceived, is central to learning at school, not long-term memory. And, very importantly we want our students to pay attention. Attention is so much a part of what working memory does that it is sometimes claimed working memory and attention are two sides of the same coin.”
Setting aside the authors’ view that long-term memory is simply a kind of store that is useful only in exams, they appear to have no idea what cognitive load theory actually claims about it and are arguing against something else.
Rather than modelling the mind as a computer, cognitive load theory models it as a ‘natural information processing system’ similar to the way genetic information is acquired and iterated by the much slower process of evolution by natural selection. Kim et al. do not mention this idea at all and seem likely to be unaware of it. This is a shame because there are ideas here that would be worth testing through robust criticism.
The authors are, however, aware of the distinction made by David Geary between biologically primary knowledge and biologically secondary knowledge. They note this was introduced later into cognitive load theory when John Sweller read about Geary’s ideas and saw their value. They can’t quite make their minds up whether the fact cognitive load theory has evolved in response to evidence and to incorporate new ideas such as this is a bug or a feature.
Briefly, biologically primary knowledge is knowledge we have evolved to acquire implicitly and without explicit teaching, like learning to speak our local language as an infant. We are motivated to learn this knowledge. Biologically secondary knowledge relies on cultural innovations too recent to have been subjected to evolution. For example, writing is only around five thousand years old and so even if being able to write provided an evolutionary advantage, there has simply not been enough time to evolve a way of learning this naturally. However, writing and reading clearly sit atop a foundation of speaking and listening. They co-opt these skills.
Kim et al. challenge the idea that we learn our local language implicitly and argue, as others have, that it involves a lot of teaching.
“…research shows that though learning our native language may appear effortless and rapid, it is not acquired through simple immersion – a lot of teaching is involved, including recurrent (Hebbian) repetition by parents, caregivers, and early childhood educators. It also involves the child’s effort and constant trial and error feedback (Thelen & Smith, 1994, 2006), using multimodal sensory inputs, where ‘consistency of non-linguistic regularities across time-separated but oft-repeated routines plays a central role in the initial learning of object names’ (Clerkin & Smith, 2022, p. 7).”
The problem with this argument is that we don’t take infants and explicitly teach them where to put their lips and tongue to make a ‘t’ sound*. Yes, they gain feedback through trial and error but trial and error is possibly the defining feature of implicit discovery learning. Contrast this with the equivalent process in writing. We do tend to explicitly teach children how to hold their pencils and form the letter ‘t’. What proportion do we suppose would learn how to do this without formal education?
The final main plank of Kim et al.’s argument seems to rest on neuroscience. However, it is never clear exactly how this applies. For example, the authors make the following point about the Global Neuronal Workspace Hypothesis:
“Within the Global Neuronal Workspace, located mainly but not entirely in the frontal cortex, pyramidal neurons with long axons are dynamically and profusely interconnected with attention systems, value systems, perceptual systems, motor systems, and long-term memory (Dehaene, 2014; Mashour et al., 2020). Given this anatomy, working memory is ‘conceptualized as an activity-induced temporary and flexible shift in the functionality of a (neuronal) network’ (Trübustschek et al., 2019, p. 14364). It is posited that an ‘ignition’ process provides the ‘first step leading to the entry of information into working memory’ (Mashour et al., 2020, p. 785). Ignition is conceived as a neural process ‘characterized by the sudden, coherent, and exclusive activation of a subset of workspace neurons coding for the current conscious content’, by transitioning ‘a weak sensory stimulus into the attended working memory state’ (p. 785), while the remainder of the workspace neurons are inhibited.”
Apparently, we all need to know about this and it’s something approaching a disgrace that learned panels of experts recommending cognitive load theory don’t mention it.
It’s all very interesting, but why do educators need to know about the Global Neuronal Workspace Hypothesis? What difference does it make? What are the implications for teaching? If this is meant to somehow refute cognitive load theory then in what way does it do this? What predictions does this theory make that conflict with those of cognitive load theory? If the authors made that clear, we could perhaps devise experiments to test them and see which theory wins. I don’t think it is a coincidence that Kim et al. write very little about empirical results of this kind because a standout feature of cognitive load theory is the vast number of randomised controlled trials that have been conducted to test its claims.
Instead, it is implied that because cognitive load theory does not talk about things the authors deem important it is, for reasons they choose not to elaborate upon, somehow lacking.
This simply does not follow. No theory has to incorporate everything. Theories are models that live and die by their ability to make accurate predictions. Cognitive load theory is not a theory of neuroscience. It is an abstract model of the mind that is entirely agnostic about what fleshy bits of grey matter constitute something like working memory.
And for good reason. Say I told you that when an item is transferred from working memory to long-term memory, an expensive scanner shows increased blood flow in an area of the brain known as Shatner’s Bassoon. How useful would that be to you as a teacher? How would you adjust your teaching? As a theory of instruction, there is a good reason why cognitive load theory would avoid neuroscience and that is because it seems to have few implications for instruction. (Jeffrey Bowers is strong on this.)
There are some worthwhile criticisms to make of cognitive load theory. Kim et al. could have made a valid argument about the difficulty in disentangling biologically primary and secondary knowledge, rather than an almost self-refuting one. They could have poked more at the issue of measuring cognitive load—a favourite of sceptics—instead of mentioning just one paper on this as if it is representative of cognitive load theory research as a whole. But they did not do this.
Cognitive load theory remains one of the most empirically validated instructional theories available to us as teachers. It is practical in a way so much educational theory is not. It is not a theory of neuroscience and is all the better for this. It is not a finished theory and is still an area of active research. This is also a good thing.
Critics would do well to learn a little more about cognitive load theory before they attempt to shoot it down.
*Yes, I understand that we sometimes do this with children who are struggling with oral language acquisition, but we don’t need to do this for the vast majority of children.
Another fun point about Kim et al.: while they profess a belief in some interest in learning, they clearly have very little actual motivation to learn something.
How difficult would it be to discuss their ideas with someone researching cognitive load theory?
They clearly haven’t bothered. Perhaps even university researchers are not as intrinsically motivated to learn as they would like to imagine.
This seems like the worst part of the whole article:
"while recognising the importance of long-term memory for exams, as teachers we don’t want our students to simply regurgitate what they have learnt, we want them to be active, imaginative, and creative learners and that means, contrary to CLT, that working memory, properly conceived, is central to learning at school, not long-term memory. And, very importantly we want our students to pay attention. Attention is so much a part of what working memory does that it is sometimes claimed working memory and attention are two sides of the same coin."
Putting aside that this is a non-sequitur for their preferred theory, doesn't every proponent of CLT fully agree that working memory and attention are 'central to learning at school'? Specifically because they are the only ways to get anything into long-term memory? Am I missing something or are they just incapable of imagining anything other than a ridiculous strawman of CLT?