If AI Is Going to Improve Teaching, It Has to Start with the Evidence

Adam Sturdee
May 16

There is no shortage of AI tools being marketed to schools right now. Lesson planners, marking assistants, behaviour trackers, feedback engines. The pitch decks promise transformation; the demos look impressive. But spend ten minutes looking under the bonnet and the same question keeps surfacing: what evidence are these products actually built on?

Too often, the honest answer is very little. The model is clever. The interface is polished. But the pedagogical assumptions underneath are either invisible or, when you press for them, surprisingly thin. “What works in classrooms” turns out to be whatever the engineering team happened to read on a blog last week.

That is a problem. Teaching has a deep evidence base. We know a great deal about what helps pupils learn, what makes feedback effective, what kinds of classroom talk push thinking, and what kinds shut it down. If AI products in education ignore that body of research, they will at best be irrelevant and at worst push teachers toward practices that look productive but are not.

The harder, slower path is to build from the evidence outward. That means the research is not a marketing afterthought layered onto a finished product. It is the architecture.

Two examples show what that looks like in practice.

Sara Hennessy and colleagues at Cambridge developed the Teacher Scheme for Educational Dialogue Analysis (T-SEDA) as a way for teachers to examine classroom talk against a coded framework grounded in years of dialogic research. It is rigorous, transparent, and built specifically to support reflection on what teachers and pupils actually say to one another. Any AI tool claiming to analyse classroom discourse needs to be measured against frameworks like this, not against whatever pattern the model invents.

Sugata Mitra’s work on Self-Organised Learning Environments points in a different direction but with the same lesson: the professional learner, like the young learner, develops most when they have genuine autonomy over the question they are exploring. AI tools that hand teachers a verdict and call it development are missing the point. The teacher has to remain the protagonist of their own learning.

The library is wider than this, of course. Rosenshine’s Principles of Instruction. Dylan Wiliam on formative assessment. Mary Budd Rowe’s research on wait time from 1986, which still gets cited because it still holds up. The EEF’s Teaching and Learning Toolkit. More recent work from Dora Demszky and colleagues on automated discourse feedback. The question is whether builders bother to read any of it.

This is why product development in education cannot be a closed loop between engineers and a marketing team. It needs teachers in the room from the first sketch onwards. Not as a customer panel at the end of the cycle, but as design partners. And it needs academic advisers who can keep the team honest about what the research actually says, where it is contested, and what claims a product cannot defensibly make.

That partnership is slower. It involves more disagreement. Features get shelved when the evidence does not support them. Wording gets rewritten when an academic adviser points out that a phrase implies more than the research can carry. But the product that comes out the other side has something most edtech does not have: a defensible answer to the question of why it works.

I am hugely optimistic about what AI can do for teaching. Coaching at scale, sustainable workload reduction, reflection that is regular rather than annual. These are real prizes. But they will only be won by tools that take the evidence seriously and treat teachers as professionals whose autonomy is worth protecting.

At Starlight we are trying to build that way.

Hennessy’s T-SEDA framework and Mitra’s work on learner autonomy are both shaping how we think about the next iteration of our coaching reports. We do not always get it right. But the standard we hold ourselves to is that any feature should be able to point to the research it stands on. If it cannot, it does not ship.

That feels like the bar the whole sector should be working to.

Adam Sturdee is a senior leader and co-founder of Starlight, the UK’s teacher-first, AI-powered, transcript-based coaching platform for educators.

His work sits at the intersection of dialogic practice, instructional leadership and responsible AI strategy for schools and trusts.

He will be presenting his research on AI-supported coaching at the BERA TEAN Conference 2026: https://www.bera.ac.uk/conference/bera-tean-conference-2026
