https://ofqual.blog.gov.uk/2026/01/14/using-ai-in-marking-why-technical-capability-fairness-and-transparency-all-matter/

Using AI in marking: why technical capability, fairness, and transparency all matter

Posted by: Jo Handford and Joanna Williamson, Posted on: 14 January 2026 - Categories: A levels and GCSEs, Artificial intelligence, Exams, Marking, Results

Ofqual has published new research exploring artificial intelligence’s potential role in marking, as well as its current capabilities and constraints. Rather than providing definitive answers, the published working paper offers an overview of the topic, shares our current thinking, highlights key challenges, and aims to stimulate further debate across the sector. This blog offers a taster of the questions we’re grappling with as we consider the future of AI in marking.

The working paper draws on a series of discussions with technology experts, awarding organisations and academics, who helped us to test and challenge our thinking about the potential applications and limitations of AI in marking.

These conversations have informed our current perspective and highlighted the complex questions that still need to be addressed.

Ofqual’s current position on using AI in marking

Our conclusion is clear: AI is promising for quality assurance and marker training, but for the moment it’s nowhere near ready to take over high stakes marking. Our position remains that the use of AI as the sole mechanism for awarding marks does not comply with our current regulations.

Such use does not meet our requirements for a human-based judgement to be used in marking decisions.

In addition, the potential for bias, inaccuracies and a lack of transparency in how marks are awarded could introduce unfairness into the system. This would be unacceptable in the marking process.

Student expectations and trust

High stakes qualifications represent a social contract between students, educators and society. Students invest years of effort into their education with legitimate expectations about how their work will be assessed, and how their qualifications will be valued and accepted by others. Currently, those expectations include having their exams marked by trained human examiners who understand both the subject matter and the assessment criteria, and who apply their academic judgement fairly and consistently.

Qualifications only hold value when students, parents, teachers and employers have confidence in their fairness, transparency and legitimacy. Any change to marking, particularly involving automation, must be carefully considered for its impact on this fundamental trust.

The challenges posed by AI in marking

Current AI technologies, especially large language models and deep neural networks, have advanced capabilities but lack true semantic understanding and the capacity for human-like judgement. These systems often function as ‘black boxes’, making it difficult even for experts to explain how specific marks are generated. This lack of transparency is a major challenge for maintaining trust in qualifications.

We must be able to understand how marks are determined. If explanations and evidence for why certain marks were awarded are unreliable, or not even available, the fundamental contract of assessment breaks down.

It is a basic principle, underpinning public trust and perceptions of fairness, that students can challenge exam marking and secure a review. When marks are generated by AI, however, the explanations and evidence needed to support such a review may be unreliable, unavailable, or difficult to interpret, making it harder to uphold the transparency and accountability essential for trust in qualifications. There is much more work to do to understand and agree what a review would mean where exams have been marked by AI rather than by a human marker. We explore these issues in more detail in our research paper.

Additionally, AI systems can perpetuate or amplify biases present in their training data, and empirical studies have shown that performance can vary across different demographic groups. Ensuring that criteria are applied consistently and fairly, regardless of candidate or marker identity, is a core principle of valid assessment.
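
To illustrate the kind of check this implies, here is a minimal sketch in Python. The records, group labels and marks are entirely invented for illustration; the sketch simply compares mean AI marks with mean human marks per group, which is the sort of systematic gap a bias audit would look for before any operational use.

```python
import statistics

# Hypothetical records: (candidate_group, ai_mark, human_mark).
# Group labels and marks are invented purely for illustration.
records = [
    ("A", 54, 55), ("A", 61, 60), ("A", 47, 49),
    ("B", 52, 58), ("B", 40, 47), ("B", 66, 70),
]

# For each group, compare the mean AI mark with the mean human mark.
# A systematic gap for one group but not another is a bias signal
# that would need investigation, not a tolerable quirk.
for group in sorted({g for g, _, _ in records}):
    ai = [a for g, a, _ in records if g == group]
    human = [h for g, _, h in records if g == group]
    gap = statistics.mean(ai) - statistics.mean(human)
    print(f"group {group}: mean AI-human mark gap = {gap:+.1f}")
```

In this invented data the AI tracks human markers closely for one group but under-marks the other by several marks on average, exactly the kind of inconsistency that would undermine valid assessment.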

Transparency and fairness are not just ethical ideals; they are essential to upholding the integrity of, and public confidence in, qualifications.

The case for using AI in marking

There are promising applications for AI in marking, particularly in quality assurance of human marking and in training new markers. These uses could potentially improve marking consistency while keeping human expertise and judgement at the centre of the process.
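
As a purely illustrative sketch of the quality assurance idea, the Python below flags human-AI mark discrepancies for senior examiner review rather than awarding any marks itself. The function ai_estimate and the tolerance are invented placeholders, not a description of any real Ofqual or awarding organisation system.

```python
# One way AI might assist quality assurance without awarding marks:
# compare a provisional human mark against an AI estimate and flag
# large discrepancies for senior-examiner review.
TOLERANCE = 4  # marks; an illustrative threshold only


def ai_estimate(script_text: str) -> int:
    """Placeholder for a model's estimated mark (hypothetical)."""
    return 50


def flag_for_review(script_text: str, human_mark: int) -> bool:
    # The human mark always stands; the AI estimate only triggers
    # a second human look when the two disagree substantially.
    return abs(human_mark - ai_estimate(script_text)) > TOLERANCE


print(flag_for_review("candidate response ...", 58))  # True: gap of 8
```

The design point is that the AI output never becomes a mark; it only directs human attention, so expertise and judgement stay at the centre of the process.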

But the vital caveats we have outlined remain. In addition, the performance of AI in marking varies significantly across different contexts. For example, there are substantial differences in what is needed to mark a maths exam compared with an English exam.

Current evidence does not support an overall or general case for using AI in marking. There needs to be context-specific evidence to make the case for using such tools.

The future of AI in marking

Our priorities are clear: ensuring fairness for students, protecting security, maintaining the validity of qualifications, and preserving public confidence, while enabling responsible innovation.

As AI technology develops, we will continue gathering evidence and fostering constructive discussion about its role in assessment.

The trust that everyone in education places in the fairness and transparency of qualifications remains at the heart of our regulatory approach.

Jo Handford, Associate Director Strategic Projects and Innovation, Ofqual

Joanna Williamson, Research Fellow, Ofqual
