Confidence issues in educational assessment
On the first post-COVID A-Level results day, we bring you an excerpt from the introduction to Mary Richardson's excellent open access book Rebuilding Public Confidence in Educational Assessment, exploring the nature of trust in assessment discourses.
In June 2013, I was conducting research in Finland. It was part of a longitudinal study with six European partner universities, and we had spent a week together writing and planning. On the final day, the weather was uncharacteristically hot for the Arctic Circle and, given the option of an indoor university tour or a trip to Santa’s village, we all chose the latter.
The village includes shops and, of course, Santa’s Post Office, where, after posting some cards, I wandered into the post room. I struck up a conversation with an ‘elf’ about the types of request they get and she showed me the files of letters, pulling out that year’s collection from England. One letter, handwritten on pink notepaper (in typically girlish writing – very rounded, with hearts instead of dots above the letter ‘i’), caught my eye. It said:
Dear Santa, What I’d like for Christmas is to get 10 A stars in my GCSEs. If I fail, I will let everyone down – they think I can do it. I try really hard at school but don’t always get the grades I want. Please help Santa. Love, xxx
I was struck by the fact that a child of 15 or 16 years old (the age when GCSE examinations are sat in England) was writing to a mythical figure for help and by the innate desperation of the request. This letter suggests that the pressure is too much, the expected level of achievement is wrong, and its presence is causing such anxiety that it led to this desperate cry for help.
Throughout this book, I use many examples from my own context in England, but I also include examples from international contexts, to demonstrate that we are facing a global crisis in education. The examples and focus for the issues in educational assessment are based on ‘discourses’ – the many ways in which we communicate and share ideas, and how we understand and make sense of the world. The letter to Santa not only reflects a discourse of high expectations (a desire to achieve top grades), it also reveals an opposing discourse framed by doom, of concern about letting people down or not being good enough.
It is important to understand that discourses are not the ‘truth’; rather, they are narratives constructed by individuals or groups to try to characterise what is meant in a particular situation. What makes discourses problematic is when they become an accepted norm or an ideal that skews how people see and understand the world around them. In educational settings, this is definitely an issue. The theory of discourses in education is explained further in Chapter 1.
Globally, the emphasis on comparative achievement in educational assessment has become more prominent since the 1990s (Unterhalter, 2019). This has radically changed our perceptions about the aims and purpose of education, and has consequently impacted on how we view educational assessment. Essentially, assessment is characterised by a received culture of competition, leading to a belief that the grade is everything. This idea is so important now that some tests are called ‘high-stakes’ tests, because their results shape us: they determine our careers, our access to higher education, our access to certain opportunities and places, and our socio-economic prospects (Torrance, 2017). The addiction to high-stakes testing is often framed by claims (which lack substantive evidence) that exams are fairer and more rigorous than any other type of assessment, so they present a more truthful, measured picture of academic achievement of which we can be more confident.
Assessment and its outcomes matter deeply to us, so I am concerned by a global lack of confidence in both policy and practice. This low confidence comes from poor understanding of two things: what assessment is and how assessment works. These two deficits have preoccupied me for some time, and this book is an attempt to present some answers to each of them in an accessible, evidence-based way.
When I tell people that my work is in educational assessment, their response is either a barely disguised yawn or, more commonly, a barrage of questions about why national testing and standards have collapsed. Despite the notion that assessment is not a very interesting topic, it appears to attract a great deal of public interest. It is time for an honest, clear explanation and conversation about its key constituents, while also challenging some of the misconceptions that emerge in public settings. Testing, particularly the examination system, is often in the news.
This leads me to question how something so influential can be regarded with suspicion and framed by challenges and anxiety.
Views of assessment are broadly influenced by a complex series of discourses that surround our understanding of its development, use and outcomes. However, an examination of popular discourses within public domains reveals an unsatisfying binary level of argument – a love–hate relationship with the whole idea of assessment. We ‘love’ the certification and selection that the results of standardised testing bring, but we ‘hate’ the extent to which grading and measuring from the same tests has the capacity to influence opportunities and can lead to personal labelling.
Much of the vast range of assessment literature that has evolved since the 1990s comprises evidence of how formative assessment could challenge our reliance on testing as ‘the best’ form of assessment and demonstrates that assessment can be a way of informing and supporting learning. But despite a plethora of resources and global engagement with the idea of assessment for learning theories, when the chips are down we do not necessarily engage with formative assessments; we prefer to rely on grades to summarise ability, skills or knowledge. Such patterns of behaviour are not unique to England, but are seen from Canada to Kazakhstan, and from Slovenia to Hong Kong. Grades are a universally accepted way of characterising achievement and understanding success in academic terms.
Much research has been conducted on this theme and it reveals consistent patterns of anxiety and pressure. Obsession with exams and the continual promotion of competition as a foundation for a sense of educational achievement has been noted as problematic since the 1950s (Fielding, 2011). Yet we continue to repeat the cycle. In England, Reay and Wiliam (1999) found that national testing schemes in English state-maintained primary schools were leading children to judge themselves based on their scores. Children were literally describing themselves as a ‘four’, or even a ‘nothing’. Their scores referred to what was called the common attainment scale across the three key stages in education. These were numbered from 1 to 8, and the children in this study (aged about 10) were working towards a national average grade of 4, so anything below this would be considered a ‘failure’. The study suggested a need to change the concern and to focus on test outcomes as a measure of potential.
However, this unhealthy obsession with grading at a young age continues. It is implicit in the public messaging shown in Figure 0.1, which appeared on an advertising hoarding at the end of my road.
Clearly aimed at the teenagers who walk by it each day en route to the nearby secondary school, this advertisement promotes an online resource designed to provide support for anxious students. What surprised me about this is the order of concerns listed: exams are at the top of the list, outranking relationships – very different to my experience of teenage years at school!
There is an inconsistency in the perceived purpose of assessment clashing with a flawed understanding of a framework of educational achievement. Politicians and policymakers claim that our education system is now more sensitive than ever to the needs of all children, yet we accept a system of testing that is increasingly reductive. Those who create and produce our high-stakes examinations claim that such assessments provide balanced ways of capturing how students demonstrate knowledge, skills and/or understanding in the subjects they study in school. In terms of test construction, reliability and validity, this may be so, but how these tests demonstrate the achievements of individuals is more ethically troubling. Teachers are increasingly forced to focus their students’ attention on grades and not necessarily because they matter to the student. Chapters 1 and 2 explain this issue and introduce the continual quest for an elusive gold standard.
This book is not an attempt to identify and challenge all of the ways in which we talk about educational assessment. Instead, I explore them using what I have identified as dominant discourses on screen, in print and online. There are literally hundreds of thousands of articles that analyse assessment in a range of ways – from the social and political, through policymaking, to technical construction and classroom practice. However, I am interested in how assessment is discussed broadly too. Look beyond the limited readerships of academic publishing and there are so many public discourses about this issue. There is no single, correct interpretation of those beliefs and perceptions that circulate around how we talk about assessment, and I’m not seeking to reveal the right way that it should be undertaken. Rather, I want to try to understand the prevailing discourses, so that there are other ways to reflect on what is happening in this controversial and contested area of education.
About the Author
Mary Richardson is a Professor of Educational Assessment at IOE, UCL's Faculty of Education and Society.