Eedi and Google DeepMind begin second AI tutoring trial across 1,525 UK students
The four-arm randomized controlled trial tests whether student-level context improves the effectiveness of a constrained AI tutor, building on a 2025 study that Stanford's SCALE Initiative recognized as one of only 20 high-quality causal studies from more than 800 papers reviewed.
UK EdTech company Eedi and Google DeepMind have begun their second randomized controlled trial of a constrained AI tutor, scaling up from 165 students in their 2025 study to 1,525 students across ten UK secondary schools.
The 12-week trial, running with students in years eight, nine, and ten, will measure learning outcomes using Renaissance's STAR Maths assessment.
Bibi Groot, Chief Impact Officer at Eedi, shared details of the trial on LinkedIn, writing: "There's something quietly thrilling about pressing GO on a trial we've spent months designing."
James Stalley-Moores, CTO at Eedi, also posted on LinkedIn, confirming the trial is now underway and describing it as "a much larger rigorous study."
What makes the AI tutor constrained
The AI tutor used in the trial is not a general-purpose chatbot. It activates only when a student answers a diagnostic question incorrectly, and the conversation is limited to the specific misconception the student holds. Each incorrect answer in Eedi's question library is mapped to a named misconception through a diagnostic engine the company has built and refined over nearly a decade of classroom use.
The design responds directly to growing evidence about the risks of unconstrained AI in learning. Research by Bastani and colleagues in 2025 found that students using an unconstrained AI tutor performed well while using the tool but significantly worse on post-tests without it, a pattern attributed to cognitive offloading.
Four conditions, one central question
The trial compares four conditions: (1) static content, including fluency practice and pre-recorded explainer videos; (2) an AI tutor with a detailed pedagogy prompt and access to diagnostic data; (3) that same AI tutor enriched with student-level personalization signals; and (4) a human tutor working without AI support. In every AI condition, a human tutor reviews, edits, or rejects each message before it reaches a student.
The central question is whether layering student-level context into the AI tutor's prompts meaningfully changes learning outcomes compared to pedagogical prompting alone.
Building on a study Stanford recognized
The first Eedi-DeepMind study in 2025 was deliberately small. Supervising tutors approved 74.4 percent of AI-drafted messages without edits, the safety audit found zero instances of harmful content, and Bayesian analysis estimated a 93.6 percent posterior probability that supervised AI tutoring produced greater knowledge transfer than human tutoring alone. Stanford's SCALE Initiative included the study in its 2026 review of AI in K-12 education as one of only 20 high-quality causal studies identified from more than 800 papers reviewed.
Eedi has described those findings as "signposts, not conclusions." Results from the second trial are expected in summer 2026, with the company positioning the work as a deliberate, evidence-first alternative to the rapid scaling underway elsewhere in the AI tutoring market.