Writing Assessment: There And Back Again (Part 1)

For almost two years, we have been experimenting with writing assessments and trying to make them more meaningful. We have gone from criterion scales to best fit, to NC objectives, to nothing, to comparative judgement. And now, here we are at yet another attempt to find the balance between workload and the usefulness of primary writing assessments. We haven’t even started our new idea yet, but I would welcome suggestions and potential pitfalls along the way. Hence, I’ll be sharing our idea in Part 2.

I know that many colleagues are in the midst of trialling CJ. I have written a few blogs about our experience here. I have also been in contact with several schools over setting it up, as well as some of the things you can do with it (I think I’m right in saying we were one of the first to try out anchoring for showing progress). More recently, I have been asked by a few other schools if we would like to do cross-school moderation/assessment judgements with them. We aren’t using online CJ this year, but here are a few reasons why you should consider it:


+ CPD – all staff get to see a whole lot of writing in the school

+ Can provide a ‘flavour’ of what writing is like throughout the school

+ Reliability score – no disputing the numbers or agreements (typically around 0.8)

+ Speed when moderating (this can’t be emphasized enough)


As you can see, we found many positives using online CJ. However, after we had discussed the outcomes, shared the scripts with both staff and pupils, explored the rankings, examined the graphs, looked at the extremities, used the feedback from staff to identify next steps in writing for the school, and shared data with governors, you know what we found out? Nothing new. Our children aren’t great at grammar (which we knew). Lots carn’t spel (we knew this). Some children are more creative or have a stronger writing voice (which we knew) and some are a little more r-o-b-o-t-i-c (which we knew). A few can’t help but can’t help but repeat themselves (knew). Some have no ideas for story writing (we knew – they are usually the reluctant readers). Hardly any could make their colons the correct size! 😉


The issue with the way we did it, in hindsight (damn you, hindsight!), was that teachers didn’t know which scripts they were marking. ‘Duh, that’s the point!’ I hear you say. But actually, is it? Isn’t the point of assessment, for teachers, to find out about their class: what they can and can’t do? By setting up CJ the way we did, we removed this crucial aspect. So the staff went away with generic whole-school issues, still blissfully unaware of what their class were capable of without reading through all of the scripts again (which defeats the purpose in the first place, right?). Let me emphasize that these are our findings. I’m sure there are schools out there that have used CJ far more effectively than we did and I am very much watching this space.


It’s interesting to read that NNM (who have been great, by the way) are aware of some of my current feelings towards (writing) assessment:

Of course, you could argue that the traditional teacher assessment also provides pupils with regular useful feedback, whereas comparative judgement is just providing an intermittent grade. We’ll deal with this point more in future posts, but for now, briefly, we’d argue that the feedback pupils get from traditional assessment, often in the form of a written comment taken from the frameworks, is not actually that helpful.



In the cold light of day, it all boils down to (idioms galore): why are we doing this? Really, though. Think about your staff. Do they really need to go through *insert whatever form of writing assessment you currently do* to have a good understanding of what to teach next? Does the impact from said assessment process warrant the time it takes?  What about measuring progress? I’ll leave that one for James Pembroke to hammer home, here http://sigplus.blogspot.co.uk/2015/04/the-progress-myth.html and here http://sigplus.blogspot.co.uk/2016/06/the-progress-myth-revisited.html.


Why am I writing this blog? Well, I have an itch in the form of writing assessment at the moment. It’s the annoying itch in the middle of your back where, try as you might, you can’t quite scratch the right spot. I still love CJ and we are incorporating it into our new method of assessing writing, which I will be sharing in Part 2. But perhaps our school has missed a trick with it. And I don’t want to miss it. So if you have read this and are sat at your screen thinking, ‘Duh! What about x, y or z?’ I’d love to hear it. After all, we’re all in the same boat and it would be great to hear others’ views and ideas to develop assessment together, assidere style!


CJ, Anchors & Our Spring Assessment

So a few weeks ago, I blogged, somewhat controversially, on the Sharing Standards Experience. It did ruffle a few feathers but, in a way, I’m glad. NoMoreMarking has made great steps in bringing to the assessment table something that could really help teachers. Refining the test conditions, the presentation of scripts and the user interface is still, I believe, a work in progress.

I’d like to share our spring data with you. Just to remind you that we were very specific in our test conditions:

  • Children from 1-6 wrote a narrative from a choice of 3 images in autumn term
  • Children from 1-5 wrote a diary entry from a choice of 3 stimuli in spring (Y6 were involved in Sharing Standards, so they were not included)
  • No prior teaching, modelling or sharing similar pieces of writing
  • No pooling of ideas or any guidance was given.
  • No redrafting or feedback was given
  • Each child completed the writing during the morning session



In the autumn term we had identified narrative as an area of focus for our school. Consequently this was the task for the autumn CJ. Our assistant head, Carl Badger, then led training on improving narrative writing. We could have used CJ again to compare the before and after to measure the impact of the intervention. This would be one good use of CJ to measure progress over a specific area. But quite frankly, it wasn’t necessary. We didn’t need an assessment to see which children had or hadn’t improved. This was evidenced in books. Most had, some hadn’t.

So in spring we decided to give the children a diary entry as we wanted to have a look at non-fiction. Diaries can incorporate elements of narrative so although it isn’t ideal to compare the two, the text types aren’t a world apart.



So here it is, in all its glory!

progress cj

Thank you Chris @nomoremarking for producing this for me.

Overall we can see ‘progress’. I use this tentatively as we are only comparing two pieces of writing. We realise this isn’t enough to make firm conclusions about the learning across the school. Nevertheless it’s interesting! In case you don’t know, the dots are the extremes. On the face of it, Year 4 have made the most progress with Year 2 showing some rather peculiar (and extreme) outcomes for that task. They’ve obviously been untaught and the Year 2 teacher needs to go!


Because the top two scripts contain a lot of personal information, I cannot share them, but here’s the highest script from Y4 (ranked 3rd overall) – a true reflection of how this child writes, completely independently:

spring best script 1, spring best script 2

Even with this highly ranked script, it’s simple to pick out next steps for this child (language, paragraphing, punctuation range e.g. ‘four-sided’, handwriting, to name a few).


We used anchors from the autumn assessment to place the spring scripts on a scaled score. This was the highest ranked script overall from the autumn (a Y6 pupil), which became one of the anchors.

ladder of curiosity

We used a total of 8 anchors.
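For readers curious how anchoring might work mechanically, here is a minimal sketch. It assumes anchoring amounts to a linear transform chosen so that the anchor scripts’ scores in the new session line up with their established scores from the earlier one; the scores below are invented for illustration, and this is not NoMoreMarking’s actual algorithm.

```python
# Illustrative sketch: placing a new judging session onto an anchor scale.
# Assumption: anchoring is a linear transform fitted so the anchors'
# new-session scores match their established scaled scores.
# All numbers are hypothetical.

# anchor scripts: (raw score in spring session, established autumn scaled score)
anchors = [(-1.8, 85.0), (-0.9, 92.0), (0.0, 100.0), (0.7, 106.0),
           (1.1, 110.0), (1.6, 114.0), (2.0, 118.0), (2.4, 121.0)]

# least-squares line through the 8 anchor points
n = len(anchors)
mean_x = sum(x for x, _ in anchors) / n
mean_y = sum(y for _, y in anchors) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in anchors)
         / sum((x - mean_x) ** 2 for x, _ in anchors))
intercept = mean_y - slope * mean_x

def to_scaled(raw):
    """Map a spring raw score onto the autumn anchor scale."""
    return intercept + slope * raw

print(round(to_scaled(0.0), 1))  # an average spring script on the old scale
```

The more anchors you use, the less one oddly judged anchor script can drag the whole scale, which is presumably why several are needed rather than one.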


What now?

We have already begun trialling direct instruction to improve writing standards and will use CJ to compare writing from before and after the intervention. I will be sharing the results of this.

At the end of this year we will choose another piece of writing to compare as part of our ‘testing the waters’ with a view to using CJ to compare portfolios in the next academic year. I’m still not convinced about trying to compare 6 different pieces of writing on screen all at once using CJ. We will either have set forms/genres that appear in the same order on-screen or will judge pieces 1 vs 1 and produce some sort of average.


Food for thought

School leaders and governors need to be able to talk about pupil groups, slow learners etc., as well as how well staff are performing. With this in mind, here are a few questions our school is currently considering:

  • As with any assessment, for what purpose are you using CJ?
  • Who is this assessment for?
  • Does it reveal anything we don’t know already?
  • Will doing this assessment have a positive impact on learning (considering workload/time etc.)?


I have always maintained that I like CJ but, as with all assessment systems, careful consideration should be given to why it is being used, how much it actually informs us, and the impact it will have on standards.

Comparative Judgement: The Sharing Standards Experience

Those who read my 2016 blog on CJ will know how interested I am in using CJ to improve our assessment in writing. Like other schools, we took part in the KS2 ‘Sharing Standards’ project. We were really pleased to be in a project where schools across the UK shared their Year 6 writing and have high hopes that it will be a vehicle for driving assessment strategies forward.

The difference from our first two internal CJ writing assessments was that we only assessed 30 children. However, each of the 30 children assessed had a portfolio (3 pieces of writing, differing in genre and form) to compare against. Whilst I understand the principle behind the thinking (a range of writing needs to be compared, not just one piece), for us it didn’t work as well. Here’s why…


Apples and Oranges

The first issue was that we weren’t comparing like for like. Some children had written a story, diary and report, whilst others had submitted instructions, stories and biographies. Even in the Mars and football example Chris Wheadon often refers to, a report was compared to a story. Now whilst some may argue this shouldn’t matter, I would question how much prior information the ‘Mars’ child was given in order to write the report. Furthermore, when the scripts appeared on screen, they weren’t in any particular order of form or genre, and some pairs contained completely different genres, both of which made it very difficult to compare.

Rather than:

cj colour shades 1

We were doing:

cj colours shades 2

Staff had to scroll back and forth to try and make comparisons from the sets of work. I could see the look of frustration on their faces. It wasn’t as quick as it should be. There wasn’t the same buzz in the room we had had previously. There was less discussion. It was hard. Overall, it was a completely different experience for staff compared to when we had used CJ before, and not in a good way.



In future, I would suggest that the scripts from each school be organised on the screen so that they appear in a set order:

  • script 1, diary entry
  • script 2, newspaper report
  • script 3, story

This would make judging them through comparison much easier.

Another thought is to hold separate sessions so that genres/forms are ranked individually. So session 1: diaries; session 2: newspapers and session 3: stories. The ranks could then be averaged out to show the mean rank. I’m not amazing with data and there is probably a huge flaw in doing this but just putting it out there!
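The averaging idea is easy to sketch. A toy version, with invented ranks for three children across the three sessions, might look like this:

```python
# Sketch of the separate-sessions idea: rank each genre on its own,
# then average each child's ranks across sessions.
# The ranks below are hypothetical (1 is best).
ranks = {
    "A": (1, 3, 2),   # (diary rank, newspaper rank, story rank)
    "B": (2, 1, 1),
    "C": (3, 2, 3),
}

mean_rank = {child: sum(r) / len(r) for child, r in ranks.items()}
overall = sorted(mean_rank, key=mean_rank.get)  # best overall first
print(overall)  # → ['B', 'A', 'C']
```

One flaw worth flagging up front: averaging ranks throws away how far apart scripts were within each session, and it only works if the same children appear in every session. Averaging the scaled scores each session produces, rather than the ranks, would keep that spacing.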


Final thought 

I still think CJ is good and it’s still in its (primary writing assessment) infancy. With more consideration it could be great. But, please, let’s not try and compare apples and oranges.



Comparative Judgement Day – exploring a different way of assessing writing

For many years now, I have felt that the way writing was assessed in primary schools was wrong. But, ironically, as with judging writing itself, I couldn’t quite put my finger on why. With the more recent changes in assessment came the secure-fit model, whereby pupils have to show evidence of all of the standards, without exception. I won’t go into how much I despise this form of assessment, other than to say that I have witnessed numerous situations where pieces of work are down-graded (or sometimes up) because of handwriting, or no evidence of the conjunction ‘but’, despite showing creative flair or voice. Is this really the best our country can come up with? Surely there is a better way…

Then I remembered watching the excellent (as always) speakers at the Beyond Levels conference in Sheffield. Ally Daubney & Prof. Martin Fautley spoke about music assessment and this got me thinking about writing. Specifically, how we are judging the art of writing as though it is mathematics. I found myself asking questions: Why do some people like certain pieces of music over others? Similarly, the same could be said about books. The 50 Shades of *insert your funniest replacement here* trilogy certainly captured the interest of reluctant adult readers like no other before them. A few years ago we even had a parent write a review of the book and say just how much the book had changed her life – I didn’t delve further!

So, what I am waffling on about is that although we know we have lots of great writers at our school, the way in which we were assessing them, internally, didn’t show this. Yes, our children could use fronted adverbials. Yes, they could use colons before a list and to provide further explanation of the previous independent clause. And yes, grammar and punctuation are very important in writing. But where many were falling down was when they had to produce their own ideas for writing, independently. And then of course, there were those adorable writers with the messiest scrawl who were able to take you away into their own world without a passive voice sentence in sight. Don’t get me wrong, our school does an extremely good job with both attainment and progress at the end of KS1 and KS2. But it just felt wrong to use a form of assessment that emphasises grammar and spelling so heavily. We didn’t give enough credence to creativity.

Then I stumbled upon this blog by Daisy Christodoulou about comparative judgement. Then another excellent article from David Didau here (there are two further parts that are a must-read). I began blabbering uncontrollably about CJ to Carl (our assistant headteacher). This is it! It makes sense!  And so after a million-and-one emails to Chris, we finally judged our first session. This is how it went…


How did you prepare for your first judging session?

We first had a practice at judging in SMT. Carl and I uploaded some pieces of writing from the standards files to judge so that a few of us were familiar with the process. Next, we introduced CJ via a staff meeting. I found it helpful to play snippets of Daisy’s presentation from ResearchED2016 alongside our own thought-processes. Carl and I spent quite some time deliberating how best to assess the children. Eventually we decided:

  • The first assessment task would be based on narrative (we had originally chosen a descriptive piece but decided against it)
  • Every child from years 1 to 6 would take part.
  • The whole school would write on the same day from the same stimulus (they had a choice of three images)
  • We clarified that no working walls were to be used; nor pooling of vocabulary or sharing ideas – basically, the children were given the stimulus and a piece of paper to plan from. We contemplated giving the children a planning format but decided against it – after all, they should be able to plan a story on their own, shouldn’t they?
  • We would only give the children two pages (or sides) to write on
  • Children were allowed as long as they wanted (within reason)
  • The QR code ‘answer sheets’ (lined paper for the pupils to write on) needed to be made larger for Year 1.
  • A handful of children were omitted from the assessments from Year 1 due to accessibility

Before we carried out our judging session we made sure all of the staff (including LSPs and apprentices) had emails and knew how to log on! This may sound daft, but I wanted everything to be as efficient as possible. Laptops were all set up and ready to go. Our staff meeting is an hour. I wanted staff to sit down, click a link and start judging right away. Which is pretty much what happened.

Note: I would recommend using iPads or tablets so as to use the zoom pinch feature.

How many scripts were judged by how many teachers, and how long did it take?

19 members of staff completed around 66 judgements each: 1,195 judgements across 176 scripts in total, in approximately 55 minutes, with a reliability of 0.87.
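For anyone wondering how that many pairwise decisions become a ranking: CJ tools typically fit a Bradley-Terry-style model, in which each script gets a quality score and the probability that script A wins a comparison depends on the difference between the two scores. The toy sketch below recovers scores from simulated judgements by gradient ascent on the model’s log-likelihood; it is illustrative only, not NoMoreMarking’s implementation, and all the data is invented.

```python
import math
import random

# Toy Bradley-Terry fit: recover script quality scores from pairwise wins.
# Model: P(A beats B) = 1 / (1 + exp(-(q_A - q_B))).
random.seed(1)
true_quality = [0.0, 1.0, 2.0, 3.0]   # 4 scripts, script 3 the strongest
judgements = []                        # (winner, loser) pairs
for _ in range(400):
    a, b = random.sample(range(4), 2)
    p_a = 1 / (1 + math.exp(-(true_quality[a] - true_quality[b])))
    judgements.append((a, b) if random.random() < p_a else (b, a))

q = [0.0] * 4
for _ in range(200):                   # simple full-batch gradient ascent
    grad = [0.0] * 4
    for w, l in judgements:
        p_w = 1 / (1 + math.exp(-(q[w] - q[l])))
        grad[w] += 1 - p_w             # winner's score pushed up
        grad[l] -= 1 - p_w             # loser's score pushed down
    q = [qi + 0.01 * g for qi, g in zip(q, grad)]

ranking = sorted(range(4), key=lambda i: q[i], reverse=True)
print(ranking)                         # strongest script first
```

The reliability figure is, broadly, a measure of how well-separated the fitted scores are relative to their estimation error, which is why more judgements per script tend to push it up.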

What feedback did you get from the session? Did you feel that the judging itself was a useful experience?

Our staff have been to many moderation sessions, both internally and externally, and we can honestly say this was the most pleasurable experience we have had. There was a real buzz in the room. Staff from opposite ends of the school were engaged in conversations about the children’s writing, sharing snippets of quality, humour and downright ridiculousness that comes from brilliant young writers. I will echo Daisy: using CJ to moderate achieves more in a shorter timescale, with everyone agreeing, than traditional moderation ever did.

What have you learned from the data?

After the judging, Chris Wheadon came to our school to help me answer some remaining questions I had about using CJ. Within five seconds of Chris opening up his laptop and showing me a graph of our results, I was ready to kiss him!

Below is a graph of the results from our autumn assessments.


The dots are each individual pupil (note the Year 3 blank script that found its way into the scanner!). The white rectangle shows the middle 50% of scores and the black line the median.


So what does our data tell us?

Lots! Including:

  • There is a high performing Year 1 writer in the school who is actually writing at a similar standard to Year 5. This came as a big surprise to us. We knew the pupil was a proficient writer, but not to such an extent.
  • Year 3 and 4 cohorts are generally writing at the same standard, as are Year 5 and 6 – consolidation anyone?!
  • There is a child in Year 6 who we hadn’t identified as being so ‘low’

We will not be using any of the data to bash staff over the head with!


How will you use the judging to help you improve teaching & learning?

Carl produced a feedback sheet for each staff member to complete after judging: what were the good elements of writing? What were the main areas for development? This gave staff a focus, and gave us an insight into exactly what different members of staff deemed to be the signposts of good writing. Another question asked was, ‘When you came across two scripts that were of a similar standard, how did you finally make a decision?’ Further discussion is evidently needed.

Once we have collated the data, we will plan a whole school initiative around improving one or two key areas of development. For example, already it seems there is evidence to suggest that children throughout the school could use more lessons on narrative planning and plot structure as well as punctuation (when isn’t this the case?!).

Example scripts can be used in each year group classroom to share with pupils; make comparisons between structurally sound pieces of writing; and explore other points of interest such as looking at use of vocabulary or effective punctuation.


What will you do next?

We are planning to give children more opportunities to write freely. No success criteria, help with language etc. Of course, we will continue to explore texts to help improve writing and immerse children in high quality writing – much of which will be other children’s in the school, as opposed to within their own class.

A repeat of this assessment will be undertaken in the spring term. All children will write with special pens for their next assessment, to redress the issue of not being able to read a small number of Year 1 scripts clearly. We had planned to use a non-fiction stimulus rather than a narrative but decided that we couldn’t make fair comparisons between the two. Chris suggested giving different children different forms and genres of writing: ‘If the pupils find different genres difficult then you may find that “progress” is going down not due to the pupils’ performance but due to the difficulties of the tasks. If you mix them up you get a better overall picture but a less reliable picture at the individual level.’ Ideally, we will build up small writing portfolios for each child, across a range of genres, to make further judgements. Chris explained that we can anchor some autumn scripts to use as benchmarks for further assessments to begin looking at progress.

So is this really judgement day for writing assessment? It’s too early to say for sure but the outlook looks promising. Perhaps with this system we will finally reward the worthy.


Craig Westby is the Deputy Headteacher at Old Hill Primary School in Sandwell.