Could YOU understand the A-level grades algorithm?

The complex algorithm used to calculate a student’s A-level results this year comprises a series of detailed stages based on their school’s previous results.

Exam regulator Ofqual had initially decided grades would be based on teacher assessments of how pupils would have performed, moderated by exam boards.

But it became apparent that teachers’ grades were far too generous, so it adjusted them using the algorithm, taking into account schools’ results over the past three years.

The formula predicts the distribution of grades for each individual school or college, based on the historical performance of the institution in that subject.

It takes into account any changes in the prior attainment of candidates entering in 2020 compared to previous years, and issues such as schools with small cohorts.  

The idea is that a school will perform the same in a subject this year as it has across recent years, taking into account changes in the underlying ability of its students.

These changes are based on a comparison of the prior attainment of students this year compared to the prior attainment of students in the historical cohort.

The idea was to stop this year’s pupils receiving higher grades than their near contemporaries and to preserve the grades’ credibility among universities and employers.

To create the algorithm, Ofqual tested 12 different standardisation models and selected the most accurate, which it calls direct centre-level performance, or DCP.

Ofqual said it had been advised by some of Britain’s leading assessment and statistical experts and consulted widely following exams being cancelled in March.

Here are the nine stages of the algorithm, as outlined by Ofqual this year:

STAGE 1 – How has a school performed in subjects in the past?

Experts began by identifying the historical performance of students for each school in each subject, providing a starting point for their predicted distribution of grades. 

They looked at the cumulative percentage of students achieving each grade or higher, with this distribution defined in the equation below.

In this formula, 0 represents ungraded, M represents the highest grade and D is the cumulative proportion grade distribution:
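
The formula itself can be plausibly reconstructed from these definitions. With $n_{mj}$ standing for the number of students at school $j$ awarded grade $m$ (a symbol introduced here for illustration), the cumulative distribution would read:

$$D_j = \left(C_{0j}, C_{1j}, \ldots, C_{Mj}\right), \qquad C_{kj} = 100 \times \frac{\sum_{m=k}^{M} n_{mj}}{\sum_{m=0}^{M} n_{mj}}$$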

The number of years over which the historical data is put together is dependent on the qualification. 

The historical grade distribution is written in terms of ‘Ckj’, the percentage of students at school ‘j’ achieving grade ‘k’ or higher.

For example, a school entering for a subject where the historical data spans three years would have a historical grade distribution along the lines sketched below:
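
As a minimal sketch of this stage, assuming invented grade counts pooled over three years (these are not Ofqual’s worked figures), the cumulative percentages can be computed as follows:

```python
# A sketch of Stage 1 for one school in one subject. Grades run from the
# highest (A*) down to ungraded (U); all figures are invented.
GRADES = ["A*", "A", "B", "C", "D", "E", "U"]

def cumulative_distribution(counts):
    """Return Ckj: the percentage of students achieving each grade or higher."""
    total = sum(counts.values())
    cumulative, running = {}, 0
    for grade in GRADES:
        running += counts[grade]
        cumulative[grade] = 100 * running / total
    return cumulative

# Hypothetical pooled counts for 2017-2019
historical_counts = {"A*": 6, "A": 18, "B": 30, "C": 24, "D": 12, "E": 8, "U": 2}
print(cumulative_distribution(historical_counts))
# {'A*': 6.0, 'A': 24.0, 'B': 54.0, 'C': 78.0, 'D': 90.0, 'E': 98.0, 'U': 100.0}
```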

STAGE 2 – How have students done in the past at subjects? 

Once the historical grade distribution is calculated, this can be adjusted based on the prior attainment of the pupils entering for the subject in each school. 

Prior attainment data is often used by exam boards to set and maintain qualification standards, allowing differences in the ability of a cohort to be factored in. 

For AS and A-level students, the prior attainment is defined as their mean GCSE score – and the student must be aged 17 for AS or 18 for A-levels on August 31.

They must also have a valid prior-attainment record and, where the mean GCSE score is used as the predictor, have results in at least three subjects taken at age 16.

However, a student’s individual prior attainment does not dictate their outcome in a subject. It is only used to model group-level relationships between prior attainment and results.

The value-added figure is therefore the relationship between a cohort’s prior-attainment distribution and the distribution of results that cohort achieved, estimated from previous years’ data.

The prediction matrix relies on cut-points that divide students into prior-attainment categories, identified by performing these two steps (sketched in code after the list):

  1. Identify all students in the historical data set across all subjects at that qualification level who a) have the relevant prior-attainment measure; b) meet any other selection criteria, such as age; and c) achieved a valid result in any subject.
  2. Work out the cut-points that divide the cohort into deciles. These define the points on the scale where pupils are split into different prior-attainment categories.
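
A sketch of these two steps, assuming mean GCSE scores as the prior-attainment measure and randomly generated values standing in for the real national data:

```python
import numpy as np

# Hypothetical mean GCSE scores for the matched national cohort; the real
# calculation would use every student meeting the selection criteria.
scores = np.random.default_rng(0).normal(5.5, 1.0, size=10_000)

# Step 2: cut-points at the 10th, 20th, ..., 90th percentiles split the
# cohort into ten equal prior-attainment categories (deciles).
cut_points = np.percentile(scores, np.arange(10, 100, 10))

def decile_of(score):
    """Return the prior-attainment category (1-10) for a mean GCSE score."""
    return int(np.searchsorted(cut_points, score, side="right")) + 1
```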

The prediction matrix can then be built through the following steps:  

1. Identify prior-attainment matched students in the historical data who have a result in the subject, and identify the prior-attainment category into which each of these students falls.

2. Cross-tabulate these prior-attainment categories with the grades achieved by the students in the subject of interest.

3. For each prior-attainment category, convert the grade distribution within that category to a cumulative percentage distribution. 

This gives the probability of a student with that level of prior attainment achieving each grade or higher in the subject.

The result of step 3 is the prediction matrix reflecting the historical value-added relationship.
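
A sketch of the matrix construction, with `students` as a list of (decile, grade) pairs for the matched historical entrants in the subject; the pairing itself is hypothetical:

```python
from collections import Counter

GRADES = ["A*", "A", "B", "C", "D", "E", "U"]

def prediction_matrix(students):
    """Map each decile to the cumulative % achieving each grade or higher."""
    by_decile = {}
    for decile, grade in students:
        by_decile.setdefault(decile, Counter())[grade] += 1
    matrix = {}
    for decile, counts in by_decile.items():
        total = sum(counts.values())
        running, row = 0, {}
        for grade in GRADES:
            running += counts.get(grade, 0)
            row[grade] = 100 * running / total
        matrix[decile] = row
    return matrix
```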

STAGE 3 – What is the historical ‘prediction’ for each student?

The cohort-level value-added relationship calculated above can then be used to work out the adjustment required for each school entering for the subject.

This adjustment is made to reflect differences in ability of the cohort entering for the subject with a school this year, compared to pupils at the same school in prior years.

The first step is to generate a historical prediction for each school entering the subject based on the prior attainment of their historical cohort as follows:

1. Identify all students in the historical data set who have the relevant prior-attainment measure, meet any other criteria and entered for the subject with each school.

2. For each school, split the prior-attainment matched students into categories based on the prior-attainment cut-points identified in Stage 2 above.

3. For each prior attainment category for each school, determine the number of students who would have been predicted to achieve each grade. 

This is done by multiplying the number of students in the prior-attainment category by the probability of students in that category achieving each grade or better.

4. For each grade, add up the number of students across the prior attainment categories predicted to achieve each grade or higher. 

5. Summing the number of students predicted to achieve each grade or higher gives the historical predicted grade distribution for that school.

Steps 3 to 5 are sketched below for a hypothetical school.
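
In this sketch, `decile_counts` maps each prior-attainment category to the number of the school’s matched historical students in it, and `matrix` is the prediction matrix from the Stage 2 sketch:

```python
GRADES = ("A*", "A", "B", "C", "D", "E", "U")

def school_prediction(decile_counts, matrix):
    """Steps 3-5: predicted cumulative % of a school's students at each grade or above."""
    total = sum(decile_counts.values())
    prediction = {}
    for grade in GRADES:
        # Step 3: students per category x probability of the grade or higher;
        # step 4: summed across categories; step 5: expressed as a percentage.
        expected = sum(n * matrix[d][grade] / 100 for d, n in decile_counts.items())
        prediction[grade] = 100 * expected / total
    return prediction
```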

STAGE 4 – What is the initial prediction for the current students?

The next step follows the same procedure as that described for Stage 3, but this time it is performed for the current cohort entering for the subject with each school.

The result of this process is the predicted grade distribution for the school in the subject this year, assuming the school follows a national value-added relationship.

To generate this prediction, Steps 2 to 5 within Stage 3 above are repeated for prior-attainment matched students entering for the subject with the school. 

The result, exemplified in the sketch below using the same school as before, is the predicted grade distribution for this year’s cohort.
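
Because Stage 4 repeats the Stage 3 procedure for this year’s cohort, the `school_prediction` sketch above can simply be reused; only the decile counts change. Both sets of counts below are invented:

```python
# Invented cohorts for one school: the 2020 entrants sit slightly higher
# up the prior-attainment scale than the historical entrants did.
historical_deciles = {4: 5, 5: 10, 6: 8, 7: 2}
current_deciles = {5: 6, 6: 9, 7: 7, 8: 3}

p_kj = school_prediction(historical_deciles, matrix)  # Stage 3 output
q_kj = school_prediction(current_deciles, matrix)     # Stage 4 output
```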

STAGE 5 – What are the prior attainment match rates for schools?

Not all students entering for the subject will have prior attainment data, possibly because they did not sit the relevant assessments.

If a school were to have a low proportion of prior attainment matched students, it would be unfair for differences in the profile of students to strongly influence the prediction. 

However, where the match rate is high in both the current year and in the historical data, it is more appropriate for differences to influence predicted outcome.

To determine the influence that the prior attainment information should have over the school-level prediction, the match rate this year and in the historical data should be considered. A summary measure of this match rate across years is defined as:
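
Consistent with Stage 6 below, which describes rj as the lowest match rate across the historical and current years, a plausible form of the summary measure is the following, where $m_j$ and $n_j$ are the matched and total entries for school $j$ (symbols introduced here for illustration):

$$r_j = \min\left(\frac{m_j^{\mathrm{hist}}}{n_j^{\mathrm{hist}}},\; \frac{m_j^{\mathrm{curr}}}{n_j^{\mathrm{curr}}}\right)$$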

For the example presented above, the following figures result:
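
With purely illustrative counts:

```python
# Illustrative only: 45 of 50 historical entries matched, 27 of 30 this year.
r_j = min(45 / 50, 27 / 30)  # both 0.9, so r_j = 0.9
```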

STAGE 6 – What is the predicted grade distribution for schools?

To form this year’s predicted grade distribution for each school in the subject, the historical information from Stage 1 is adjusted based on the predictions in Stages 3 and 4.

The weight given to this adjustment is determined by the match rate calculated in Stage 5. The adjusted prediction is calculated as below:
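
The equation can be reconstructed from the term-by-term description that follows. Writing Ckj for the Stage 1 historical distribution, pkj for the Stage 3 prediction, qkj for the Stage 4 prediction and rj for the Stage 5 match rate, it takes the form:

$$P_{kj} = (1 - r_j)\,C_{kj} + r_j\left(C_{kj} + q_{kj} - p_{kj}\right)$$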

The equation can be broken down as follows to explain its operation: 

The first term in the equation above controls the amount of influence the raw historical outcome has over the prediction at the grade, based on the proportion of unmatched students.

If the lowest match rate (rj) across the historical and current years was 60 per cent, this term would contribute 40 per cent of the weight to the prediction. 

The second term in the equation controls the influence of the prior attainment adjusted outcome; so in this example would contribute 60 per cent of the weight.

Therefore, where a school has no prior-attainment matched students, the school-level prediction is defined entirely by the historical school outcome: rj = 0, so the second term collapses to zero, leaving Pkj = Ckj.

The output from this step is the school-level prediction for the subject.

Using the example above, the increase in the prior-attainment profile of students in summer 2020 relative to previous years gives rise to an increase in the school-level prediction for the subject, as shown below:
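
A sketch of this blend across all grades, reusing names from the earlier sketches:

```python
def blended_prediction(c_hist, p_hist, q_curr, r):
    """Stage 6: combine the historical outcome with the prior-attainment adjustment."""
    return {
        grade: (1 - r) * c_hist[grade] + r * (c_hist[grade] + q_curr[grade] - p_hist[grade])
        for grade in c_hist
    }

# With a higher-attaining 2020 cohort, q_kj exceeds p_kj at each grade, so
# the blended prediction rises above the raw historical distribution.
P_kj = blended_prediction(cumulative_distribution(historical_counts), p_kj, q_kj, r_j)
```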

STAGE 7 – What are the suggested estimated student grades?

Having performed the steps outlined above, it is possible to produce notional grades for this summer’s students. 

This is performed by overlaying the rank order provided by the school onto each school’s predicted cumulative percentage grade distribution.

The proportion of students awarded each grade within the school should match the predicted distribution as closely as possible.

The rank orders of students collected from schools were articulated separately for each grade.

For the purposes of this stage, the rank order for each school was restructured into a single contiguous rank order covering all grades. 

This restructuring retains the overall ordering of students within the school. A sketch of the overlay is provided below:
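
This sketch assumes `ranked` lists the school’s students best first and `prediction` is the Stage 6 distribution; the exact rounding convention is a simplification of Ofqual’s:

```python
GRADES = ("A*", "A", "B", "C", "D", "E", "U")

def notional_grades(ranked, prediction):
    """Award each student the highest grade whose predicted cumulative % covers their rank."""
    n = len(ranked)
    awarded = {}
    for position, student in enumerate(ranked, start=1):
        share = 100 * position / n  # cumulative % of the cohort at this rank
        awarded[student] = next(g for g in GRADES if share <= prediction[g])
    return awarded
```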

STAGE 8 – How do you ensure the grading is credible?

Cohort-level statistical predictions play an important role in ensuring that the overall distribution of grades is credible, to avoid inadvertent severity or leniency.

The grade information is therefore turned into imputed marks, to identify the students who would be most likely to move up or down a grade.

A mark scale with notional cut-scores is then constructed. It has 100 marks per grade and ranges from 0 to M × 100, where M is the number of grades available. 

Notional cut-scores for each grade therefore occur at 100-mark intervals: for example, at A-level the A* notional cut-score is 600 and A is 500, and so on.

The imputation uses each student’s notional grade together with their position in the rank order submitted by their school and the predicted grade distribution, Pj.

To perform this imputation, a calculation is carried out that spaces students evenly across the mark range available for their notional grade.

The mark of the lowermost student at each notional grade depends on the difference between the actual predicted grade distribution and the notional grade distribution that it was possible to achieve.

If the rounding was favourable and only just led to a student being awarded the higher grade, the lowermost student has an imputed mark very close to the notional cut-score, and vice versa.

The consequence is that, when the notional cut-scores are adjusted to realise the cohort-level predictions, students close to the notional cut-scores are prioritised for regrading.

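A sketch of the even-spacing idea follows, assuming the A-level mark scale described above and the notional grades from the Stage 7 sketch; the placement of the lowermost student in each band is simplified relative to the rounding-dependent rule Ofqual describes:

```python
CUT_SCORES = {"A*": 600, "A": 500, "B": 400, "C": 300, "D": 200, "E": 100, "U": 0}

def imputed_marks(ranked, awarded):
    """Space students at each notional grade evenly across that grade's 100-mark band."""
    by_grade = {}
    for student in ranked:                    # ranked best first
        by_grade.setdefault(awarded[student], []).append(student)
    marks = {}
    for grade, group in by_grade.items():
        base = CUT_SCORES[grade]
        step = 100 / len(group)               # even spacing within the band
        for i, student in enumerate(group):
            # The best-ranked student in the group sits highest in the band.
            marks[student] = base + 100 - (i + 0.5) * step
    return marks
```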

STAGE 9 – How do you calculate the overall standards?

The final stage of the process is to set the cut-scores to achieve an appropriate overall standard. This then determines students’ calculated grades. 

This part of the process is similar to a key part of the grade boundary setting process in a typical year. 

Usually, exam boards identify grade boundaries that would most closely represent a maintenance of statistical standards over time based on the cohort-level prediction.  

This year, the mark data arises from the process described above rather than from the marking of student work.

As in a typical year, this part of the process is done using prior attainment matched students only to ensure, as far as is possible, a like-for-like comparison of cohorts.

The cut-scores are set at the imputed mark for the student in the cohort who most closely reflects the statistical prediction for the subject.

This year, to ensure a consistency of standard across exam boards, this process is performed nationally.

That means a single prior-attainment matched student mark distribution is formed across exam boards with the cut-score set against a single national prediction at each grade.
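
A sketch of the national cut-score setting, assuming `marks` holds the imputed marks of all prior-attainment matched students across exam boards and `national` is the national predicted cumulative percentage at each grade (both names are illustrative):

```python
def cut_scores(marks, national, grades=("A*", "A", "B", "C", "D", "E")):
    """Set each grade's cut-score at the mark of the student who best matches the prediction."""
    ordered = sorted(marks, reverse=True)  # best mark first
    n = len(ordered)
    scores = {}
    for grade in grades:
        # The student whose rank most closely realises the predicted % at
        # this grade or higher defines the cut-score.
        index = min(max(round(n * national[grade] / 100), 1), n)
        scores[grade] = ordered[index - 1]
    return scores
```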