Stanford researchers warn AI chatbots are pulling users into 'delusional spirals' with real-world consequences
New study of 19 verbatim human-chatbot transcripts finds AI is reinforcing grandiose and paranoid beliefs, with researchers calling for lawmakers to treat alignment as a public health issue.
Stanford researchers have published new findings warning that AI chatbots are actively reinforcing delusional thinking in some users, with consequences ranging from ruined relationships and careers to, in one documented case, loss of life.
The paper, which will be presented at the ACM FAccT Conference, analyzes 19 verbatim transcripts of real conversations between humans and chatbots and calls on both AI developers and policymakers to act. Its findings carry implications for any education setting where students and staff now use generative AI tools daily.
Jared Moore, a PhD candidate in computer science at Stanford University and first author of the paper, and senior author Nick Haber, assistant professor at Stanford Graduate School of Education, argue that chatbot design itself is creating the risk, not rogue user behavior. The study was partially funded by the Stanford Institute for Human-Centered AI.
What a delusional spiral looks like
The researchers define delusional spirals as conversations in which a user presents an unusual, grandiose, paranoid or imaginary idea and the chatbot responds with affirmation, encouragement or help in building out the fantasy, while offering what Moore calls intimate reassurances that can sound all too human. Models extend the exchange rather than push back, and they are not equipped to route users who express suicidal or violent thoughts toward real help.
"People are really believing the AI," Moore says. "As you read through the transcripts, you see some users think that they've found a uniquely conscious chatbot."
He adds that "Chatbots are trained to be overly enthusiastic, often reframing the user's delusional thoughts in a positive light, dismissing counterevidence and projecting compassion and warmth. This can be destabilizing to a user who is primed for delusion." The team links the pattern to alignment training that pushes models to please and validate, combined with the tendency to hallucinate.
Researchers want alignment reframed as public health
Moore and colleagues recommend that AI developers add metrics for delusional-spiral risk to model testing and build detection filters that flag harmful conversation patterns, while acknowledging that privacy concerns complicate that work. "I think AI developers have a vested interest in addressing this concern about the use of their models in ways they likely never even intended or imagined," Moore notes.
On policy, the researchers call for lawmakers to treat alignment as a public health issue, with new standards for flagging sensitive conversations, greater transparency into safety tuning, and clear rules for crisis escalation when a user shows signs of self-harm or violence.
Warning lands as governments act on young people and the online world
The Stanford paper arrives as governments around the world are moving on how young people interact with digital platforms. In England, the UK government this week tabled an amendment to the Children's Wellbeing and Schools Bill to ban student phone use in schools, putting existing Department for Education guidance on a statutory footing. Ofsted began assessing school phone policies as part of inspections this month.
Australia has gone further. A full ban on social media access for children under 16 came into effect on 10 December 2025, covering Facebook, Instagram, Kick, Reddit, Snapchat, Threads, TikTok, Twitch, X and YouTube, with providers facing civil penalties of up to AUD 49.5 million for failing to take reasonable steps to keep under-16s off their platforms.
Haber's position at Stanford Graduate School of Education puts the new study squarely in front of the education sector, where chatbots are now embedded in tutoring platforms, student wellbeing tools and university research workflows. The risk Moore identifies, that models treat every conversation as something to extend and defer to, runs directly counter to the kind of friction a teacher, counselor or peer would apply.
"When we put chatbots that are meant to be helpful assistants out into the world and have real people use them in all sorts of ways, consequences emerge," Haber says. "Delusional spirals are one particularly acute consequence. By understanding it, we might be able to prevent real harm in the future."
AI companions and tutors are being rolled out across schools and universities, sometimes without agreed safety standards for vulnerable users, even as regulators move on phones and social media. The open question is whether chatbots will be the next category to face the same kind of statutory intervention.
Note: This article touches on sensitive topics including self-harm. Anyone affected can contact the Samaritans in the UK on 116 123, or the 988 Suicide and Crisis Lifeline in the US.