The best any politician can hope for is that, when they introduce a policy with a goal in mind, that goal is subsequently met with few serious unintended consequences. Politicians may then claim that their policy caused this goal to be met, but that will always be a matter for debate. Unless the policy is rolled out as a randomised controlled trial, such a claim is impossible to prove beyond reasonable doubt.
This is one reason why the UK is potentially so interesting from an education policy perspective. Its constituent nations have embarked upon quite different approaches to education, including early reading instruction. Unfortunately, Wales and Scotland have not taken part in recent rounds of the PIRLS reading comprehension assessment for students aged 9-10 and so a useful comparison is missing.
Nevertheless, when we look at England's PIRLS results over time, they seem to be improving. This is particularly true if we note the sampling issues in PIRLS 2001 for England, which may have led to an overestimate of the mean score. Depending on taste, we may choose to include or exclude the 2021 results, which were affected by the pandemic and, as a consequence, by different school systems engaging with the assessment at different times.
Again, this doesn't prove anything conclusive, but it is suggestive of an improvement in England that correlates with a number of key policy initiatives, such as the introduction of a phonics screening check and the promotion of early literacy programs that meet a set of government benchmarks based on systematic synthetic phonics. It is just a pity we cannot see results for Scotland and Wales on the same assessment. Scotland participated in the first two cycles but then withdrew, ostensibly due to the cost, and plans to return in 2026.
If England genuinely has improved, this is significant, because England's reforms targeted only word reading and not language comprehension (unless we include the larger, slow-burning project of building a knowledge-rich curriculum), and reading comprehension as measured by PIRLS is a product of both word reading and language comprehension. Those of us working in schools know how difficult this combination of factors makes the task of moving the dial on reading comprehension assessments, compared with, say, mathematics or grammar tasks.
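The implicit model here is the 'simple view of reading' proposed by Gough and Tunmer (1986), often summarised as:

$$\text{reading comprehension} = \text{decoding} \times \text{language comprehension}$$

On this view, a gain in word reading alone can lift comprehension scores, but only as far as students' language comprehension allows.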
Dr Jonathan Solity is not persuaded by this narrative. At MultiLit's recent Advancing Effective Education Summit, Solity delivered a keynote outlining the recent history of early literacy teaching in England and, borrowing the name of Emily Hanford's successful podcast series, argued that teachers in England have been 'sold a story' about the effectiveness of systematic synthetic phonics and, his particular bugbear, decodable books.
Systematic synthetic phonics involves teaching students the relationships between letters and the sounds they represent in a planned sequence and then blending those sounds to read words. The evidence for systematic phonics instruction is strong. However, there is ongoing debate among reading researchers over whether the synthetic approach—building words from sounds—is more effective than analytic approaches, which focus on analysing sounds within whole words. Essentially, nobody has done, or intends to do, a full randomised controlled trial comparing the two, so we are at an impasse.
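To make the distinction concrete, here is a toy sketch in Python of the two procedures. The tiny grapheme set, the function names and the use of onset-rime word families to stand in for the analytic approach are all my own simplifications for illustration, not a description of any actual program.

```python
# Toy contrast between synthetic and analytic phonics as procedures.
# The grapheme-phoneme correspondences (GPCs) below are a tiny,
# invented subset; real programs teach dozens in a planned sequence.
GPCS = {"sh": "/sh/", "ch": "/ch/", "s": "/s/", "i": "/i/", "p": "/p/"}

def synthetic_decode(word: str) -> list[str]:
    """Synthetic approach: convert letters to sounds left to right,
    then blend the sounds into the whole word."""
    sounds, i = [], 0
    while i < len(word):
        for size in (2, 1):  # try digraphs like 'sh' before single letters
            grapheme = word[i:i + size]
            if grapheme in GPCS:
                sounds.append(GPCS[grapheme])
                i += size
                break
        else:
            raise ValueError(f"untaught grapheme at {word[i:]!r}")
    return sounds  # the child blends these: /sh/ /i/ /p/ -> 'ship'

def analytic_family(words: list[str], rime: str) -> dict[str, str]:
    """Analytic approach (roughly): start from whole known words and
    draw attention to shared parts, e.g. the '-ip' rime in a family."""
    return {w: f"{w[:-len(rime)]} + {rime}" for w in words if w.endswith(rime)}

print(synthetic_decode("ship"))                       # ['/sh/', '/i/', '/p/']
print(analytic_family(["ship", "chip", "sip"], "ip"))
```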
Decodable books are those that restrict the letter-sound relationships in the text to ones the students have learned so far in their reading program. I have never understood the decidedly moralistic concern about these books and the strong attachment some have to 'real' books with uncontrolled letter-sound relationships. I remember my own daughters delighting in being able to read decodables without help. I still read Alison Lester and Julia Donaldson books with them, and so did their teachers, so I don't feel they missed out. However, I can see that a prolonged reliance on decodables alone, well after the first year of school, would be undesirable, and I wonder whether this happens in some places and is what is driving the dissent.
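The underlying constraint is simple enough to sketch. In the following toy Python check, the teaching sequence is invented for illustration (though 's a t p i n' is a common opening set in phonics schemes), and real decodability checks are subtler, handling digraphs, taught 'tricky' words and so on.

```python
# Minimal sketch of what makes a word 'decodable' relative to a
# program's teaching sequence. Sequence and words are illustrative only.
TAUGHT_SO_FAR = {"s", "a", "t", "p", "i", "n"}

def is_decodable(word: str, taught: set[str]) -> bool:
    """A word is decodable if every grapheme in it has been taught.
    This naive version treats each letter as its own grapheme."""
    return all(letter in taught for letter in word)

for word in ["tap", "pin", "sat", "ship"]:
    print(word, is_decodable(word, TAUGHT_SO_FAR))
# tap True, pin True, sat True, ship False ('h' has not been taught yet)
```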
Solity's presentation was noticeably barbed. It included a section in which he listed how many times Nick Gibb, the minister most associated with England's reforms, had been sacked and reappointed. This did not appear to be relevant and came across as mean-spirited rather than humorous in the way Solity perhaps intended. Maybe I am influenced by the fact that I have met Gibb a few times and I like him.
In making the case that England’s reforms have made little difference, Solity referenced a report by the Education Policy Institute. I won’t elaborate much on that here because Andrew Old is in the process of taking that report apart on his Substack.
On the phonics screening check, Solity made what struck me as a particularly odd claim. The check takes place in Year 1 in England—the same-age equivalent of our Foundation year in Australia. Students who do not meet the 32-mark threshold in Year 1 are required to sit the assessment again in Year 2. Solity claimed that we should expect students who have ‘passed’ this assessment in either Year 1 or Year 2 to have the same level of reading fluency, and yet those who meet the threshold in Year 2 tend to have lower reading fluency than those who do so in Year 1.
I find this entirely unsurprising. The phonics check is an assessment of word reading, not fluency, and so those who are able to read words earlier have more time, opportunity and aptitude for the practice that leads to fluency. Treating the check as if it were an assessment of fluency in order to demonstrate that it is not an assessment of fluency seems like an overly elaborate straw man.
Solity also drew on evidence from England's SAT assessments, standardised tests that students take at the end of Years 2 and 6. He showed that the results initially increased but then flatlined from 2016 to 2024 at around a 75% 'pass' rate. By 'pass', I think he was referring to the proportion of students meeting the expected standard; Solity seems to prefer framing assessments in terms of pass rates. What Solity did not mention is that the expected standard was raised in 2016, so the results he presented were not measured on the same scale, and a flat pass rate after 2016 actually represented an initial increase.
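To see why this matters, consider a back-of-envelope illustration. The distribution and cut scores below are invented purely for the arithmetic; they are not England's actual scaled scores or thresholds. The point is that holding a pass rate flat across a raised threshold requires the underlying distribution of attainment to shift upwards.

```python
# Why a flat 'pass' rate across a raised standard implies improvement.
# All numbers are hypothetical, chosen only to make the arithmetic clear.
from statistics import NormalDist

cohort_2015 = NormalDist(mu=100, sigma=15)  # hypothetical pre-2016 cohort
old_cut, new_cut = 95, 102                  # hypothetical old/raised standards

print(round(1 - cohort_2015.cdf(old_cut), 2))  # 0.63 meet the old standard
print(round(1 - cohort_2015.cdf(new_cut), 2))  # 0.45 meet the raised standard

# For a later cohort to post the *same* 0.63 pass rate against the raised
# standard, its whole distribution must have shifted up by the difference:
cohort_later = NormalDist(mu=100 + (new_cut - old_cut), sigma=15)
print(round(1 - cohort_later.cdf(new_cut), 2))  # 0.63 again
```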
Solity is a director of Optima Psychology, which has its own initial reading program. I had not heard of this program before. What I took from Solity's explanation was that it teaches children sight words and, although it teaches letter-sound relationships explicitly, it teaches significantly fewer of them than other programs do. Nevertheless, the claim is that Optima enables children to access almost as many words as these other programs.
I take that at face value and will suspend judgement until I read an effectiveness study.
Solity finished by being quite rude about his home country. He referenced various Australian cultural icons and asked, 'What could you possibly learn from England?' Again, I suppose this was intended to be amusing, but it grated on me.
Curiously, Dr Rhona Stainthorpe OBE was scheduled to give the keynote talk directly after Solity's, and her subject was pretty much the same history and the same data set. Yet Stainthorpe drew a very different set of conclusions. In contrast to Solity's suggestion that teachers have always taught synthetic phonics, Stainthorpe told an anecdote about a teacher who, around 25 years ago, hid all her phonics materials in a cupboard when the school inspectors visited.
Stainthorpe's take on the PIRLS data was thorough. She pointed out that, while England's mean had been rising, perhaps more importantly, the least advanced students had been making the greatest gains, so the rise had been accompanied by a closing of the gap. This is a key factor for Australians to consider, because our gap is large. Stainthorpe also highlighted the 2016 shift in how SATs were assessed.
Once we leave the world of controlled experiments and enter the world of policy, advocates of different approaches can plausibly argue for quite contradictory narratives. It was interesting and unusual to see two such arguments presented one after the other. However, I still think the conventional view—that England has improved in reading, and that this is at least partly a result of government reforms—is compelling.
I had my popcorn out. It was fun to be surprised by this quirk of programming.
It is difficult, indeed, to interpret the data from a time series. And, once one "leave[s] the world of controlled experiments and enter[s] the world of policy," the arguments can be predicated on very different beliefs. So, with the data you showed, one can argue just about anything: "The world is flat!" "Oh, no it's not. It's round!"
The methods of single-subject (or "-case," if one prefers) research allow one to do so, but they require many more measurement occasions and rigorously controlled comparisons between two levels of an independent variable. In fact, those are experimental methods, too.
BTW, the report of the US National Reading Panel included a meta-analytic comparison of synthetic and analytic phonics. Both were more effective than the control condition (essentially, whole language), but the average effect size for synthetic phonics was not statistically significantly greater than that for analytic phonics. I should publish my slides about this comparison...sigh.