Stanford report finds limited evidence of AI's impact in K-12 classrooms
New analysis from Stanford’s SCALE initiative highlights rapid growth in AI research, but finds only a small number of rigorous studies guiding real-world education decisions.
Stanford Graduate School of Education’s SCALE initiative has published a new review of artificial intelligence in K-12 education, finding that while AI tools are being adopted quickly across classrooms, there is still limited high-quality evidence on their actual impact on teaching and learning.
The report, The Evidence Base on AI in K-12: A 2026 Review, analyzes more than 800 academic studies, drawn from a research repository that has since grown to over 1,100 papers, but identifies just 20 causal studies that rigorously measure how AI tools affect outcomes for students or educators. The findings highlight a widening gap between adoption and evidence, as schools integrate AI into daily practice without a strong research foundation.
Rapid growth in research, but limited causal evidence
Interest in AI in education has increased sharply, with the report showing a steep rise in published research in recent months. However, most studies focus on technical development, model performance, or observational analysis rather than measurable impact.
Only a small subset of studies meet the threshold for causal evidence, meaning they can demonstrate whether AI tools directly improve or change learning outcomes. This creates a challenge for education leaders making decisions about procurement, policy, and classroom use.
The report positions this gap as a structural issue in the current EdTech landscape. Tools are being deployed at scale, while the research needed to validate their effectiveness is still emerging.
The analysis shows that most research focuses on students as users of AI tools, with significantly less attention given to how teachers or school leaders use AI in practice.
Among the causal studies identified, a large proportion concentrate on math-related outcomes, with fewer studies examining literacy, broader academic subjects, or social-emotional development.
There is also limited research on how AI impacts different groups of students, including questions around equity, access, and long-term outcomes. The report highlights that most studies measure short-term performance rather than sustained learning.
Early evidence shows mixed outcomes for learning
Across the 20 causal studies reviewed, early findings suggest that AI tools can improve student performance during tasks such as math exercises, writing, and programming when the tools are actively used.
However, results are mixed when students complete assessments without AI support. In some cases, performance improves, while in others it remains unchanged or declines.
This raises a central question for educators: whether AI tools are supporting skill development or simply enabling task completion.
The report also identifies differences in tool design as a key factor. AI systems that guide reasoning, provide hints, or scaffold learning show more consistent positive outcomes than general-purpose tools that generate direct answers.
Learning science frameworks referenced in the report, including cognitive load theory, the zone of proximal development, and metacognition, suggest that tools that reduce productive struggle may limit long-term learning gains.
Potential efficiency gains for teachers
While student-focused research dominates, the report highlights early evidence that AI tools can support educators in specific areas.
Causal studies suggest that teacher-facing AI tools may reduce time spent on lesson preparation, support real-time instructional decisions, and provide insights into student progress without reducing instructional quality.
Some studies indicate that AI can help teachers ask more targeted questions and adjust their teaching based on automated feedback. However, as with student use, the evidence base remains limited.
Key gaps remain as adoption accelerates
The report identifies several areas where research is still lacking.
To date, there are no high-quality causal studies of student AI use conducted in U.S. K-12 classrooms. Most existing studies are short-term and do not examine long-term learning, transfer of knowledge, or broader impacts such as student wellbeing or social development.
However, AI tools are continuing to enter classrooms at pace, with teachers and students already using them both inside and outside formal learning environments.
Implications for policy and EdTech decision-making
The findings underline the pressure on education leaders to make decisions about AI without a strong evidence base.
The report calls for a more evidence-led approach to policy, procurement, and classroom implementation, suggesting that current discussions are often driven by tool availability or future predictions rather than measured impact.
By focusing on causal studies, the review provides a baseline for what is currently known and highlights where further research is needed.
For EdTech providers and education systems, the direction is clear. As AI becomes embedded in teaching and learning, demonstrating measurable impact is likely to become a more central requirement, particularly as schools look to justify investment and assess long-term outcomes.