Why we need the SATs (pt2)

Hello everyone and isn't the weather lovely today (assuming you're reading this on either a) Friday, May 7th, or b) a lovely day)! Before I begin with this week's post, I just want to mention that I was contacted by a lovely reader called Amelia, who, after reading my post about creating digital escape rooms, suggested an improvement to my QR Code generator of choice.

Having tried out her suggestion, I can honestly say that it is now my new favourite place to generate free custom QR codes. There are a lot more free tools to choose from - I had a play around with the frames but ended up going without - and you can change the colour of the QR code (the preview will even warn you if the contrast will be an issue when applied). For free! So if that sort of thing is interesting to you (and if you've read or tried any of my digital escape rooms, then it should be!), have a look at www.websiteplanet.com.

Back to this week...

Today's post is a continuation of my previous entry so, while you can probably get along just fine with only reading this one, it might be better contextually to give part one a read first.

Before I get into the remaining five types of bias and my rationale for not using teacher judgment alone as an assessment guide, there has been an update in the world of assessment guidance!

Cath Jadhav, the Director of Standards and Comparability at Ofqual, who previously worked for the AQA and AEB exam boards, has said that, following submission of grades, "exam boards will review all grades as part of their quality assurance." [1 - emphasis added]. Correct me if I'm wrong but is this not simply moderation? A lot of people are very upset about this. The chief executive of the TEAL multi-academy trust called it a "waste of time" [2] and the chief executive of Advantage Schools said there was no point in submitting teacher-assessed grades beyond allowing Ofqual to look like they were "doing something." [3] A further argument against this moderation is that it flies in the face of Gavin Williamson promising to "trust teachers" [4]. But does it?

No-one is saying that Williamson no longer trusts teachers. Indeed, if that were the case, then exams would surely be reinstated. Instead, I think this is a case of trying to keep everybody happy. And let's not forget, the number one reason for assessment in schools is to hold schools accountable for progress [5]. Is it truly unreasonable to ask for some evidence to back up any claims of achievement that schools submit? Without some degree of moderation, there is no scope to:

ensure that measures are put in place to address any nationwide gaps caused by the pandemic
curtail any grade inflation (it'll happen - schools are now a commercial enterprise and looking good is necessary to attract 'clients')
prove how well the children have done

Again, like last week, this latest guidance is not for Primary education, which is probably a good thing because moderation is already an expectation of many Primary schools. And I know, Secondary schools and MATs probably internally and collaboratively moderate as well but, if they do, then I don't see why they're making such a fuss over potentially submitting a few grades.

Anyway, that's all happened since I wrote part one and none of it is about primary schools... but it sort of is as well.

Primary schools are under no obligation to formally assess any children this year nor are they required to submit their teacher assessments. And this brings us to the final five causes of bias and the final five reasons why externally graded tests are a good thing.

Here we go...

To recap from last week, we have already looked at the potential dangers of Gender, Affinity, Attribution and Beauty biases. Next up:

Confirmation Bias

This old chestnut is when we go with our gut. If we assume that someone is going to do badly, we will look for them to do badly, sometimes unconsciously choosing to overlook the times when they have done well.

I read an account in a book (I think it was It's Your Time You're Wasting: A Teacher's Tales of Classroom Hell, Frank Chalk, but I can't be sure) of a supply teacher who goes in to teach a top set biology class and is surprised by how slowly they seem to be progressing. They're getting the idea of things and, by the end of the hour, all (or most) have completed the activities and understood the concepts. When the teacher comments that he would have expected a top set class to have worked a little more quickly, he is told by the children that they are actually the bottom set and hadn't covered the material before. The moral of the tale being not to pre-judge a situation or a person because they can surprise you.

Studies and psychologists have linked confirmation bias with stereotyping (negatively and positively) [6,7] and this is, unfortunately, still a problem in schools. If you disagree, I want you to ask yourself if you have ever heard any of the following:

Girls are better at English than boys.
Boys are better at maths than girls.
Biology is more of a female subject.
Girls are more empathetic.
Boys are not so good at talking and discussion.
Girls chat too much.
Child X's parents were no good at maths, so it's no wonder child x is struggling.
They never do their homework.
Mind you, have you seen their parents? It's not surprising that...

And we could go on. Confirmation bias is alive and well in Primary education and we shouldn't be in the position where we have to mitigate for it. It's too hard! People don't like to admit they're wrong so they seek out ways that will prove them right [8]. If you still think I'm wrong, ask yourself if you have ever been surprised by a child's test score. If the answer is yes, then you are guilty of confirmation bias... and being human.

The Halo Effect

This is where your initial impressions of a child can continue to affect your judgment of them for days and months to come [9]. Did a child impress you during the first term? Then you are more likely to consider them favourably for the rest of the year. This even extends to their siblings or parents. If you have positive associations with a child, you are more likely to look upon their actions and results favourably.

Again, it is so easy to fall victim to this when considering teacher assessment. How many times have you been marking a piece of work, read an answer you know to be wrong but thought, 'oh, I know what they meant' and given them at least a partial mark? This is the halo effect in operation [10]. It also applies if you have ever, intentionally or otherwise, adjusted a child's idea to make it more correct. The old 'I think you mean...' or 'If you mean... then yes, you're right!'.

Like any of these biases, they aren't inherently bad but they are also not a reliable way to assess a child's progress and provide a grade that accurately reflects their attainment levels.

The Horns Effect

This is equal but opposite to the Halo Effect so I won't dwell. Needless to say, we have all had a child with whom we have clashed. That child who is on the borderline between 'Exceeding' and 'Expected' but because they have annoyed us one time too often, we put them down as 'Expected' because they need a wake-up call, or they need bringing down a peg or two, or they need to check that arrogant streak. We can justify this by telling ourselves that, if they truly are that clever, then it won't matter because they will be fine next year or in the next school. And, let's face it, they probably will. The likelihood of one slightly bad grade, a grade that genuinely could have gone either way and still be justified, being so catastrophically detrimental to a child's educational and emotional wellbeing is slim. It could happen but even I would argue that such a child has bigger problems if that's the case.

However, and I know I sound like a corrupted MP3 file (a reference that, even now, is outdated, despite my trying to be cool), it shouldn't be up to us to make that decision. If we know the child on a personal level, we should not be able to judge their academic progress in isolation.

The Contrast Effect

This is probably the least concerning bias on the list if I'm honest. This is where we judge children against each other instead of against defined criteria. I think moderation meetings are guilty of this but I also think that, with so little guidance, we need something to go on and direct comparison may as well be it. I also know for a fact that it is (sort of) how grade boundaries are defined [11].

It becomes problematic when we are so entrenched in groups and cohorts that we start to ship children into bunches of 'they're pretty much the same level as X'. In the pre-2014 days of APP, I was told by a Local Authority representative to pick three children (top, middle, and bottom) and group the others around them, widely casting an achievement net around the three sections of the class.

More recently, while using some target tracking software, I was told to group the children into rough abilities and bulk-tick the targets even if they hadn't all achieved them. Obviously, this would vary from school to school but I am confident that most teachers, being short on time and heavy on pressure, would admit to this behaviour, regardless of how much we know we shouldn't.

Conformity Bias

People are herd animals. We like to fit in. We watch the same TV shows so that we can join in the conversation in the staff room. We check social media to ensure that we are not too much of an outlier. It's evolutionary: the lone wolf dies; the pack survives. So when a colleague whom we respect tells us that a child does or doesn't deserve a particular grade, it's hard not to be influenced by it.

That's pretty obvious, though. More alarming is the propensity of teachers to unconsciously punish children whom they see as non-conformists [12]. The children who have 'silly ideas' or who 'deliberately disrupt' the lesson. These are children who don't conform to the social norms and cues that we would expect and so we decide that they have not achieved as well as they should have.

I'll be honest, this one surprised me because I hadn't considered this angle. But it makes sense, doesn't it? Even at a fundamental level - that of answering the question. If a child is tasked with retelling Red Riding Hood as a newspaper report but they instead write one about Rapunzel, have they failed the task? They haven't answered the question so can we award them a grade?

But if we look at the elements behind the assessment, the features of a newspaper report; the idea that they adapt a traditional tale; the fact that they are considering their audience in their writing, then they may well have ticked a lot of boxes. But they didn't do what you asked - they either didn't read the question or didn't understand (then they should have asked... but not under test conditions, that would be wrong). Or maybe they didn't write a newspaper report at all but re-wrote Red Riding Hood in a modern setting where the wolf is a secret government agent and the grandmother is a terrorist assassin looking to recruit the vulnerable young Red? So their grade should reflect that lack of basic understanding. But what they have written is really good...

If you're anything like me, you would go and ask a colleague for their advice on the situation.

Because conformity bias is sneaky and real.

Here's the thing: none of these biases exists in a bubble and I don't think that any of them are, on the whole, conscious actions. But to ignore their potential is problematic at best. Primary teachers spend so much time with their classes, all day, every day, five days a week. I don't think Secondary educators quite understand the bonds that are formed. Many primary school teachers will have known the children since they were four years old and it can be difficult to shake that first impression. Secondary teachers see the children for the first time as young teenagers - it's simply not the same.

We are their school-mums-and-dads. We walk that painfully narrow line between friend and mentor. We want, desperately, for each and every one of them to succeed so hard that they surprise themselves and everyone around them. We don't just want the best for them, we want them to redefine what 'the best' looks like. Of course we're biased! Most of us positively but it doesn't matter. What matters is that we cannot be trusted to dispassionately judge these children. Let's be honest, our children (no child refers to you as 'the' teacher, they call you 'my' teacher). So I think it is truly unfair to put us in a position where we have to do so. And yes, the same could be said of Secondary teachers, especially as they are submitting teacher grades as well, but a lot of Secondary schools are having the children sit externally moderated exams as well, and picking the best of the two grades. Primary doesn't have that option. Many Primary schools are planning to use the 2019 SATs later in the Summer Term but these a) won't be moderated externally, and b) could very easily be leaked ahead of time, rendering any data gathered from them completely invalid.

What's worse is that there isn't even any guidance on how best to go about it. I'm sure most schools will internally moderate, and schools who are part of a MAT may well intra-moderate, but there is no obligation to do so. Smaller schools are free to cherry-pick the evidence that best suits their decision with potentially no moderation at all. And all schools are free to self-define what 'independent' looks like. With zero national normalization, and seemingly no risk (the Department for Education seems to only care about Secondary schools and higher), we could be condemning these children to a legacy of grades they didn't earn.

But maybe it doesn't matter. Maybe assessment at so young an age is irrelevant anyway. Perhaps we should use this pandemic as an opportunity to abolish any sort of national testing and comparison altogether. 'They're only babies' is a protestation I have heard all too often, 'they shouldn't have to take exams.'

But if we're not going to check what they can do; if we're not going to hold them accountable for their learning; if we're not going to monitor knowledge gaps and adjust our teaching to ensure that everyone gets a fair turn of the wheel, then we may as well just let them colour in. That's what people think we do anyway, why fight it? If Primary education is just glorified babysitting then let's help them justify our lack of pay rise by simply babysitting.

Or maybe Primary teachers, and the children they educate, deserve better. Maybe the teachers deserve government guidance. Maybe the children deserve the chance to prove that they can achieve even in the face of overwhelming difficulties. Maybe we all deserve the opportunity to submit grades that are beyond reproach or accusations of inflation. Maybe national testing is a useful tool for informing future education. Maybe, just maybe, we need the SATs.

That's me done for this week. It's taken a lot of time to write this because I had to do a lot of research (one of the downsides of completing an MA is that I am no longer able to simply write something - I now have a nagging voice in my head saying prove it, or citation, maybe?). I hope you've found it interesting if nothing else. I have to admit that when researching the different kinds of bias, I was remembering times in my career when I have been guilty of pretty much all of them. However, I don't think that the aim is to completely erase bias from the profession - I don't think that's realistic, especially since they are unconscious biases. No, I think the aim should be for more heightened awareness of the possibility of bias. To maybe not be so quick to defend our decisions but instead, happily discuss differing opinions. I also don't think that SATs on their own are a good idea. There should be a mix of Teacher Assessment and external, formal assessment (like the Secondary schools are being offered this year). At least that way, we have a barometer for our decision-making. If a child performs unusually well on national tests (some children do), then we should be able to use that score to review past performance and start a discussion on why that child's grades were so low throughout the year. Similarly, if we know that a child deserves a higher score based on previous evidence, then we should be able to submit that higher score.

I'll close with an anecdote from Willy Wonka himself: Gene Wilder (don't you dare mention Depp #NotMyWonka!). When asked about working with Mel Brooks on Young Frankenstein, Wilder recalled the Puttin' on the Ritz sequence, which was his idea and he was very keen on it.

Brooks, so the story goes, said that it was a terrible idea and he wouldn't include it in the movie. Wilder argued its merits, insisting that the bit was funny and that the film's ending would suffer without it. This went on for quite a few minutes, with Wilder insisting and Brooks refusing and Wilder insisting more intensely. Eventually, out of nowhere, Brooks agreed to add the bit. Just an 'okay, sure, it's in'.

Wilder was confused and asked why Brooks put him through the ordeal of fighting his corner for a silly bit of comedy that he was going to approve anyway. Brooks's reply is where all of this ties in with why moderation and assessment are essential. Brooks said:

I wasn't sure if it was right or not, and if you didn't argue for it, then I knew it would be wrong. But if you really argued for it, then I knew it would be right.


I think that's the goal. Knowing that the level of education, and therefore the level of ongoing support and challenge, that we give a child is accurate enough to withstand arguments to the contrary. That's what we don't have at the moment.

Anyway, I've written about three different endings to this post and I have to get it published before the end of the week so I had better stop here. Thanks for reading, if you have. Thanks for reading last week's part one, if you did. And thanks for recommending my blog to people, if you do! I've been writing for almost two years now and I am always humbled when I see that people are taking time out of their busy day to read my musings.

A massive shout-out to everyone reading in Hong Kong, the USA and Indonesia - I had no idea I was being read by people who are so far away from me! And to everyone who reads either regularly or semi-regularly, if you have anything you would like me to look into, or explain, or review, anything at all to do with the world of education, please get in touch. I would love to hear from you.

One more thing...

My new website is live! It's still a little rough around the edges and my team and I are refining things as we go but it is live for all the world to see. Check it out for yourself at www.igniteeducation.co.uk. My free resources will be migrating over there soon (for the time being, they are still over at Carl's Learning Place); you can book me for some private teaching, either individually or as part of a small group (live or via the web); and my books will make their way there as well very soon. Exciting times!

Thanks again and I wish you many 'cautious hugs'* from next week!

Carl Headley-Morris


*In the UK, from May 17th, the government have approved larger social gatherings and condoned 'cautious hugging' [13]. I'm not just being weird.

References used in this post:

[1] https://ofqual.blog.gov.uk/2021/04/22/quality-assurance-for-gcse-as-and-a-level-information-for-schools-and-colleges/

[2] & [3] https://schoolsweek.co.uk/gcse-and-a-levels-ofqual-reveals-quality-assurance-evidence-requirements/

[4] https://www.tes.com/news/exams-2021-gcse-alevel-gavin-williamson-trust-teachers-parents

[5] https://www.gov.uk/government/publications/key-stage-2-mathematics-test-framework

[6] Carl Metzgar. "Confirmation Bias: A Ubiquitous Phenomenon in Many Guises." Professional Safety 58.9 (2013): 44. Web.

[7] Julie A. Nelson (2014) The power of stereotyping and confirmation bias to overwhelm accurate assessment: the case of economics, gender, and risk aversion, Journal of Economic Methodology, 21:3, 211-231, DOI: 10.1080/1350178X.2014.939691

[8] https://explorable.com/confirmation-bias#:~:text=Wason's%20Rule%20Discovery%20Test%20proves,but%20rather%20to%20confirm%20them.&text=They%20argued%20that%20the%20behavior,as%20a%20positive%20test%20strategy

[9] Runco, Mark A. Parents' and Teachers' Ratings of the Creativity of Children. Journal of Social Behavior and Personality; Corte Madera, CA Vol. 4, Iss. 1, (Jan 1, 1989): 73.

[10] Grcic, J., 2008. The halo effect fallacy. Electronic Journal for Philosophy, 15(1), pp.1-6.

[11] https://www.aqa.org.uk/about-us/what-we-do/getting-the-right-result/how-exams-work/making-the-grades-a-guide-to-awarding/making-the-grades-video-transcript#:~:text=The%20process%20for%20deciding%20grade,overseen%20by%20the%20qualifications%20regulator.

[12] Christina Lynn Scott (1999) Teachers' Biases Toward Creative Children, Creativity Research Journal, 12:4, 321-328, DOI: 10.1207/s15326934crj1204_10

[13] https://www.reuters.com/world/uk/uk-pm-johnson-announce-covid-lockdown-easing-junior-minister-says-2021-05-10/
