Making good progress? Putting formative assessment to the test
In their widely acclaimed 1998 publication Inside the Black Box, British educationalists Paul Black and Dylan Wiliam popularised the notion of ‘formative assessment’, highlighting the importance of how teachers gather feedback on pupils’ learning and use it to adapt their teaching to better meet pupils’ needs. This kind of responsive teaching has been found to be effective in improving student learning but, while many schools already use formative assessment strategies, they often find them challenging to implement effectively.
These challenges were behind the Education Endowment Foundation’s (EEF) decision to fund a team at SSAT (the Schools, Students and Teachers network) to deliver Embedding Formative Assessment (EFA), a whole-school intervention. Seventy schools delivered the programme in the form of a two-year professional development pack, which contained everything they needed to run 18 monthly workshops, alongside support from a team of designated practitioners.
EFA: Putting formative assessment to the test
The results of the evaluation by my colleagues and me at the National Institute of Economic and Social Research are published today. They show a positive impact on GCSE results as measured by Attainment 8, the students’ average achievement across eight subjects. Pupils taught in schools that ran the professional development programme made an average of two additional months’ progress in Attainment 8, compared to those in control schools.
While these results are positive and suggest there may be benefits in implementing this programme in schools, our broader evaluation highlights two important issues that need to be considered when attributing results to interventions: first, that interventions can be delivered differently in participating schools; and second, that interventions may have a greater impact over the longer term.
Implementation varied across schools, so how do we explain impact?
Our qualitative research showed that schools often made substantial adaptations to key components of the programme’s design. For example, while schools committed to a set agenda and length for the monthly workshops, in reality they often adapted the structure to meet the needs of the school. Similarly, while schools committed to organising monthly peer observations, in which pairs of teachers observed each other’s lessons, in practice these happened much less frequently, often due to timetabling constraints.
From a school’s perspective, these changes were seen as necessary to implement the programme effectively. From an evaluator’s perspective, the differences in implementation across schools raise obvious questions: to what extent did schools implement the actual programme, and did they all implement the same programme? And if they didn’t, what caused pupils to improve their GCSE results? While we cannot answer these questions definitively, our qualitative findings allow us to speculate.
What was consistent about the programme across schools?
What was consistent across treatment schools was the use of monthly workshops called Teaching Learning Communities (TLCs). While these varied in structure and agenda, they were in all cases collaborative groups in which teachers discussed and shared their successes and failures with assessment practices. TLC groups were cross-curricular, and teachers repeatedly emphasised that learning from colleagues in other departments and across subjects had been invaluable in sharing good practice. For many schools and teachers this was a new and very welcome way of working with colleagues, which was likely an important factor in the programme’s success.

What we can’t answer is whether schools could have done even better if they had followed the TLC structures more rigidly, or had carried out other core practices, such as peer observations, more regularly. The formative assessment strategies themselves were often well known, so EFA’s value to teachers may have been the opportunity for a sustained focus over a longer period: enhancing what they were already doing and revisiting good practices that had gone off their radar.
Are we understating the impact?
The observed impact was statistically significant at the 10% level; in other words, if the programme had no true effect, there would be less than a 10% probability of observing an impact of this size by chance. Other parts of our statistical analysis also add to our confidence in the conclusions. But it is possible that the results would have been more robust had we observed the pupils over a longer period. At the time of the interviews, halfway through the intervention, teachers said it had already had some positive impact on teaching quality and feedback, and that it had increased pupils’ engagement and enjoyment of lessons. But teachers frequently doubted that this would already have translated into better GCSE results after two years. Instead, they believed that younger pupils, with longer-term exposure to the feedback techniques, might benefit more.
While this is highly tentative, it points to the need for longer-term follow-up of pupils who have experienced such interventions. The EEF is currently looking at the longer-term outcomes of some of its early projects, including Growth Mindset, where one conclusion of the process evaluation was that the intervention would likely bear fruit in pupil behaviour and performance over time.
In the meantime, this evaluation does suggest that EFA can improve the ways in which teachers provide feedback to pupils, helping them learn more effectively and achieve better results. It also provides further evidence for the Teaching and Learning Toolkit, compiled by the Sutton Trust and the Education Endowment Foundation, which indicates that high-quality feedback can be one of the most cost-effective ways of boosting pupils’ attainment.