Well, it's May again and that can only mean one thing: pretty soon, education outlets around the country are going to start lambasting, moaning, debating and (sometimes) inventing things about the Key Stage 2 SATs.
I myself am not beyond reproach when it comes to such things: see my article in the TES, the follow-up a year later and several others. Well, this year, given that my recent posts have been a little light on the research that underpinned earlier ones (ah, remember when I had time and not a 1-year-old...?), I decided to condense one of my A*-grade essays (not that I would ever brag) from my Master's in Academic Research. In it, I explore the sample testing of Science in Key Stage 2: the reliability of the results; the validity of the tests; the impact on the curriculum; that sort of thing.
It's a good read, I promise, but I know we are all time poor so I won't be offended if you save it for later. In a nutshell, it is thus:
The Science sample tests in England are a valuable tool for teachers and policymakers. They provide information on student attainment in science, and they can be used to identify areas where students need more support. The tests are also designed to be reliable and valid, which means that the results can be trusted.
The Science sample tests can be used in a variety of ways, and they offer a number of advantages. They provide a valuable source of information about student attainment in science; they can help teachers to identify students who are struggling, to plan lessons tailored to the needs of their students, and to report on achievement to parents and guardians; and they can help policymakers to monitor national progress in science and identify areas where students need more support.
There are also a few disadvantages to the Science sample tests. First, they can be time-consuming for teachers to administer. Second, they can be stressful for students to take. Third, they can lead to teaching to the test.
One of the main concerns about the Science sample tests is that they can lead to teaching to the test. This is when teachers focus on teaching students specific strategies for answering test questions, rather than teaching them the material in a deep and meaningful way. Teaching to the test can be educationally damaging because it does not help students to develop a deep understanding of the material.
There are a number of things that can be done to reduce the risk of teaching to the test. First, teachers need to be aware of the problem and be committed to teaching science in a deep and meaningful way. Second, schools need to provide teachers with the resources they need to teach science effectively, such as access to high-quality curriculum materials and professional development opportunities. Third, policymakers need to create an environment that supports teaching science effectively, such as by providing schools with adequate funding and by holding teachers accountable for student learning, but not in a way that encourages teaching to the test.
Another concern about the Science sample tests is that they can lead to curriculum narrowing. This is when teachers focus on teaching the material that is likely to be covered on the test, rather than teaching the full range of the science curriculum. Curriculum narrowing can deprive students of the opportunity to learn about important topics that are not covered on the test.
The remedies for curriculum narrowing mirror those for teaching to the test: teachers need to be aware of the problem and committed to a broad and balanced science curriculum; schools need to provide high-quality curriculum materials and professional development; and policymakers need to design accountability arrangements that do not reward a narrow focus on tested content.
The Science sample tests can also be a source of stress for students, who may feel pressure to do well and become anxious as a result. Several things can be done to reduce this. First, teachers can help students to understand that the tests are just one part of their overall science education. Second, schools can provide support, such as stress-management techniques and access to mental health professionals. Third, policymakers can work to create an environment that reduces the pressure on students to perform well on standardised tests.
The Science sample tests are a valuable tool for teachers and policymakers. However, it is important to be aware of their limitations: the time they take to administer, the potential for teaching to the test and curriculum narrowing, and the stress they can cause for students. On balance, though, they remain a useful means of improving science education in England.
If that has piqued your interest and you would like to read the whole essay, let me know and I'll send you a copy. I've left a list of the references below (just like old times!).
In the meantime, whatever your school's SATs results, remember that you did your best; the children did their best; and that, ultimately, it doesn't matter a jot!
Ahmed, A., & Pollitt, A. (2011). Improving marking quality through a taxonomy of mark schemes. Assessment in Education: Principles, Policy & Practice, 18(3), 259–278.
Alt, D. (2018). Teachers’ practices in science learning environments and their use of formative and summative assessment tasks. Learning Environments Research, 21(3), 387–406.
Barton, C. (2019, May 1). Talking my language (No. 1.1) [Podcast]. In Inside Exams. AQA. https://www.aqa.org.uk/inside-exams-podcasts/episode-1
Bew, P. (2011). Independent review of Key Stage 2 testing, assessment and accountability: Final report. The Stationery Office.
Black, P., Harrison, C., Hodgen, J., Marshall, B., & Serret, N. (2010). Validity in teachers’ summative assessments. Assessment in Education: Principles, Policy & Practice, 17(2), 215–232.
Brookhart, S. M. (2013). The use of teacher judgement for summative assessment in the USA. Assessment in Education: Principles, Policy & Practice, 20(1), 69–90.
Brown, M., McCallum, B., Taggart, B., & Gipps, C. (1997). The Validity of National Testing at Age 11: the teacher’s view. Assessment in Education: Principles, Policy & Practice, 4(2), 271–294.
Child, S., Munro, J., & Benton, T. (2015). An experimental investigation of the effects of mark scheme features on marking reliability. Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment. https://www.cambridgeassessment.org.uk/Images/417277-an-experimental-investigation-of-the-effects-of-mark-scheme-features-on-marking-reliability.pdf
Cobern, W. W., Schuster, D., Adams, B., Skjold, B. A., Muğaloğlu, E. Z., Bentz, A., & Sparks, K. (2014). Pedagogy of Science Teaching Tests: Formative assessments of science teaching orientations. International Journal of Science Education, 36(13), 2265–2288.
Department for Education. (2016). National curriculum in England: science programmes of study. https://www.gov.uk/government/publications/national-curriculum-in-england-science-programmes-of-study/national-curriculum-in-england-science-programmes-of-study
Wiliam, D. (2020, May 2). What every teacher needs to know about assessment [Video]. YouTube. https://www.youtube.com/watch?v=waRX-IOR5vE
El Masri, Y. H., Ferrara, S., Foltz, P. W., & Baird, J.-A. (2017). Predicting item difficulty of science national curriculum tests: the case of key stage 2 assessments. The Curriculum Journal, 28(1), 59–82.
Emerson, R. W. (2019). Cronbach’s Alpha Explained. Journal of Visual Impairment & Blindness, 113(3), 327. https://doi.org/10.1177/0145482x19858866
Frey, B. (Ed.). (2018). Construct Irrelevance. In The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation. https://methods.sagepub.com/Reference//the-sage-encyclopedia-of-educational-research-measurement-and-evaluation/i5811.xml
Gipps, C. (2011). Beyond Testing (Classic Edition): Towards a Theory of Educational Assessment. Routledge.
Green, S. (2002). Criterion referenced assessment as a guide to learning: The importance of progression and reliability. Association for the Study of Evaluation in Education in Southern Africa International Conference, Johannesburg, 10–12.
Harlen, W. (1999). Purposes and Procedures for Assessing Science Process Skills. Assessment in Education: Principles, Policy & Practice, 6(1), 129–144.
Harlen, W., Crick, R. D., Broadfoot, P., Daugherty, R., Gardner, J., James, M., & Stobart, G. (2002). A systematic review of the impact of summative assessment and tests on students’ motivation for learning. http://storre.stir.ac.uk/handle/1893/19607
He, Q., Anwyll, S., Glanville, M., & Opposs, D. (2014). An investigation of measurement invariance of the Key Stage 2 National Curriculum science sampling test in England. Research Papers in Education, 29(2), 211–239.
Isaacs, T. (2010). Educational assessment in England. Assessment in Education: Principles, Policy & Practice, 17(3), 315–334.
Janan, D., & Wray, D. (2012). Guidance on the principles of language accessibility in National Curriculum assessments: research background. https://core.ac.uk/download/pdf/9589254.pdf
Kellaghan, T. (1996). The Use of External Examinations to Improve Student Motivation. American Educational Research Association.
Kirton, A., Hallam, S., Peffers, J., Robertson, P., & Stobart, G. (2007). Revolution, evolution or a Trojan horse? Piloting assessment for learning in some Scottish primary schools. British Educational Research Journal, 33(4), 605–627.
Koretz, D. (2005). Alignment, High Stakes, and the Inflation of Test Scores. In Yearbook of the National Society for the Study of Education (Vol. 104, Issue 2, pp. 99–118). https://doi.org/10.1111/j.1744-7984.2005.00027.x
Koretz, D. M. (2009). Measuring Up. Harvard University Press.
López-Pastor, V. M., Pintor, P., Muros, B., & Webb, G. (2013). Formative assessment strategies and their effect on student performance and on student and tutor workload: the results of research projects undertaken in preparation for greater convergence of universities in Spain within the European Higher Education Area (EHEA). Journal of Further and Higher Education, 37(2), 163–180.
Massey, A. J. (1995). Criterion‐related Test Development and National Test Standards. Assessment in Education: Principles, Policy & Practice, 2(2), 187–203.
Messick, S. (1994). The Interplay of Evidence and Consequences in the Validation of Performance Assessments. Educational Researcher, 23(2), 13–23.
Ofqual. (2014). Measurement Invariance of the Key Stage 2 National Curriculum Science Sampling Test in England (No. 17/5376).
Equality Act 2010. (2010). legislation.gov.uk. http://www.legislation.gov.uk/ukpga/2010/15/contents
Preece, P. F. W., & Skinner, N. C. (1999). The National Assessment in Science at Key Stage 3 in England and Wales and its Impact on Teaching and Learning. Assessment in Education: Principles, Policy & Practice, 6(1), 11–25.
Qualifications and Curriculum Authority. (2003). SATs Papers Online. SatsPapers.org. http://www.satspapers.org/scienceKS2SATS.htm
Reise, S. P., Widaman, K. F., & Pugh, R. H. (1993). Confirmatory factor analysis and item response theory: two approaches for exploring measurement invariance. Psychological Bulletin, 114(3), 552–566.
Spurgeon, S. L. (2017). Evaluating the Unintended Consequences of Assessment Practices: Construct Irrelevance and Construct Underrepresentation. Measurement and Evaluation in Counseling and Development, 50(4), 275–281.
Key stage 2 science sampling test framework (draft). (2014). http://satspapers.org/SATs%20papers/2016%20samples/2016%20Sample%20Science/2016_Key_stage_2_Science_sampling_Test_Framework.pdf
Standards and Testing Agency. (2014a, March 31). Key stage 2: English reading test framework. GOV.UK. https://www.gov.uk/government/publications/key-stage-2-english-reading-test-framework
Standards and Testing Agency. (2014b, March 31). Key stage 2: science sampling test framework. GOV.UK. https://www.gov.uk/government/publications/key-stage-2-science-sampling-test-framework
Standards and Testing Agency. (2016, January 28). Key stage 2 science sampling 2014: methodological note and outcomes. GOV.UK. https://www.gov.uk/government/publications/key-stage-2-science-sampling-2014-methodological-note-and-outcomes
Standards and Testing Agency. (2017a, July 20). Key stage 2 science sampling 2016: methodology note and outcomes. GOV.UK. https://www.gov.uk/government/publications/key-stage-2-science-sampling-2016-methodology-note-and-outcomes
Standards and Testing Agency. (2017b, September 28). 2016 key stage 2 science sampling: sample questions, mark scheme and commentary. GOV.UK. https://www.gov.uk/government/publications/2016-key-stage-2-science-sampling-sample-questions-mark-scheme-and-commentary
Standards and Testing Agency. (2018a). 2016 key stage 2 science sampling: modified test materials. GOV.UK. https://www.gov.uk/government/publications/2016-key-stage-2-science-sampling-modified-test-materials
Standards and Testing Agency. (2018b). Teacher assessment exemplification: KS2 science. GOV.UK. https://www.gov.uk/government/publications/teacher-assessment-exemplification-ks2-science
Standards and Testing Agency. (2019). 2020 key stage 2: assessment and reporting arrangements (ARA). GOV.UK. https://www.gov.uk/government/publications/2020-key-stage-2-assessment-and-reporting-arrangements-ara
Stobart, G. (2009). Determining validity in national curriculum assessments. Educational Research, 51(2), 161–179.
Strand, S. (2007). Minority ethnic pupils in the Longitudinal Study of Young People in England. DCSF Research Report RR-002. Department for Children, Schools and Families.
The Strategy and Communications Department of the National Union of Teachers. (2010). Common Ground on Assessment and Accountability in Primary Schools. http://www.learnersfirst.net/private/wp-content/uploads/Resource-Common-Ground-on-Assessment-and-Accountability-in-Primary-Schools.pdf
Wiliam, D. (2001). Reliability, validity, and all that jazz. Education 3-13, 29(3), 17–21.
Withey, P., & Turner, S. (2015). The analysis of SATS results as a measure of pupil progress across educational transitions. Educational Review, 67(1), 23–34.