Highly cited meta-analysis claiming ChatGPT boosts student learning retracted over data concerns
A paper in a Springer Nature journal that attracted nearly half a million views and hundreds of citations has been pulled after editors found discrepancies in the underlying analysis.
A widely cited meta-analysis that claimed OpenAI’s ChatGPT has a large positive effect on student learning performance has been retracted by Humanities and Social Sciences Communications, the Springer Nature journal that published it just under a year ago.
The paper, authored by Jin Wang and Wenxiang Fan of Hangzhou Normal University's Jing Hengyi School of Education, reviewed 51 experimental and quasi-experimental studies published between November 2022 and February 2025. It reported that ChatGPT had a large positive impact on learning performance (g = 0.867) and moderately positive effects on both learning perception (g = 0.456) and higher-order thinking (g = 0.457).
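For readers unfamiliar with the notation: the retraction notice does not define g, but in educational meta-analyses it conventionally denotes Hedges' g, a standardized mean difference between treatment and control groups with a small-sample correction. A standard formulation (an illustration of the convention, not taken from the retracted paper) is:

```latex
% Hedges' g: standardized mean difference with small-sample correction factor J
g = J \cdot \frac{\bar{X}_T - \bar{X}_C}{s_p},
\qquad
s_p = \sqrt{\frac{(n_T - 1)\,s_T^2 + (n_C - 1)\,s_C^2}{n_T + n_C - 2}},
\qquad
J \approx 1 - \frac{3}{4(n_T + n_C) - 9}
```

Under Cohen's widely used benchmarks (0.2 small, 0.5 medium, 0.8 large), the reported g = 0.867 for learning performance would count as a large effect, while the values near 0.46 would be moderate.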
Originally published on May 6, 2025, the paper accumulated approximately 486,000 views, 266 citations, and an Altmetric score of 1,023 before being retracted on April 22, 2026.
Why the paper was retracted
The journal's retraction notice states that the editor identified "discrepancies in the meta-analysis" that "ultimately undermine the confidence the Editor can place in the validity of the analysis and resulting conclusions." The notice also confirms that the authors did not respond to correspondence about the retraction.
The specific nature of the discrepancies has not been publicly detailed beyond the editor's statement, leaving questions about whether the issues relate to data extraction, effect size calculations, study selection, or some combination of these.
What the paper originally claimed
The meta-analysis had drawn attention for its broad scope and specific practical recommendations. It suggested that ChatGPT should be used continuously for four to eight weeks for optimal effects, that it was most effective in problem-based learning environments, and that its impact on learning performance was strongest in skills and competencies development courses.
The paper also reported moderating effects across course type, learning model, and intervention duration. It concluded that ChatGPT should be "flexibly integrated into teaching as an intelligent tutor, learning partner, and educational tool."
These findings were picked up across education and AI research circles, contributing to a growing body of literature being used to inform institutional decisions about generative AI adoption in classrooms.
What this means for the evidence base
The retraction removes from the literature one of the most visible quantitative studies supporting ChatGPT's effectiveness in education. With 266 citations already logged, the paper's claims are embedded in subsequent research, raising questions about how downstream studies that relied on its findings will address the retraction.
The episode also highlights a recurring tension in AI-in-education research: the speed at which studies are produced and cited can outpace the scrutiny needed to verify their methods. Only nine of the 51 included studies examined higher-order thinking, and only 19 looked at learning perception, subsets the authors themselves acknowledged were limited.