Generative artificial intelligence chatbots may be worsening the symptoms of some people who experience severe mental health conditions. A recent analysis of thousands of patient health records reveals instances where interacting with these programs reinforced false beliefs, encouraged self-harm, or exacerbated eating disorders. The findings were recently published in the journal Acta Psychiatrica Scandinavica.
The research was led by Søren Dinesen Østergaard, a professor and psychiatrist heading a research unit at Aarhus University Hospital in Denmark. He collaborated with Sidse Godske Olsen and Christian Jon Reinecke-Tellefsen, who are both researchers affiliated with Aarhus University and the hospital’s psychiatry department. They wanted to investigate how modern digital tools affect vulnerable populations in a real-world clinical setting.
Artificial intelligence programs like ChatGPT rely on underlying structures called large language models. These are massive mathematical systems trained on vast amounts of text from the internet. When a person types a prompt, the system functions as a giant statistical engine, calculating the most likely next word to generate a highly fluent response.
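The "statistical engine" idea can be illustrated with a toy model. Real large language models use deep neural networks over subword tokens, but the core mechanism, predicting the most likely continuation from preceding context, can be sketched with simple bigram counts (the corpus and function names here are invented for illustration):

```python
from collections import Counter, defaultdict

# Toy illustration only: real LLMs learn billions of parameters over
# subword tokens, but the underlying idea -- pick the statistically most
# likely next word given the context -- can be shown with bigram counts.
corpus = (
    "the model predicts the next word "
    "the model generates the next token "
    "the user types the next prompt"
).split()

# Count how often each word follows each preceding word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def most_likely_next(word):
    """Return the most frequent continuation of `word` in the corpus."""
    followers = bigrams[word]
    return followers.most_common(1)[0][0] if followers else None

print(most_likely_next("the"))  # → "next" (follows "the" most often here)
```

A fluent chatbot does essentially this at enormous scale, which is why its answers sound confident regardless of whether they are true.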
Because these conversations feel so natural, many people begin to treat the software as if it were a conscious being. This can lead to a false sense of trust, especially because the models are often programmed to be highly agreeable. To keep users engaged, the software rewards responses that make the human happy, which sometimes means echoing a user’s worldview back to them.
In a psychiatric context, a delusion is defined as a fixed, false belief that a person maintains despite clear evidence to the contrary. If someone believes they are a divine messenger or that they are being secretly monitored, an agreeable chatbot might validate that idea rather than challenge it. Østergaard previously warned that this dynamic could trigger or worsen psychosis in predisposed individuals, acting as a powerful tool to confirm false beliefs.
Prior to this recent study, most evidence of this phenomenon came from media reports and online forums. News articles described individuals who engaged in marathon chat sessions, developing religious infatuations or elaborate conspiracy theories alongside their digital companions. One widely reported case involved a man who spent up to sixteen hours a day talking to a program, eventually coming to believe he was living in a simulated reality.

The researchers wanted to move beyond anecdotal news stories to see if this problem was appearing inside a professional medical environment. They set out to search the clinical notes of an entire regional healthcare system. They reviewed records from the Psychiatric Services of the Central Denmark Region, which provides inpatient, outpatient, and emergency care to roughly 1.4 million residents.
The team gathered electronic health records from nearly 54,000 unique patients treated between September 2022 and June 2025. During this nearly three-year window, medical staff entered more than ten million clinical notes into the regional database. The researchers then scanned this massive collection of text for any mentions of artificial intelligence programs.
To ensure they captured as many cases as possible, they searched for the words “chatbot” and “ChatGPT” alongside twenty common misspellings. They actually generated this list of misspellings by asking ChatGPT itself how human users most often type its name incorrectly. This allowed them to catch typos like “chatboot,” “ChatGBT,” and “ChatJPT” in the doctors’ notes.
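A keyword screen of this kind is straightforward to sketch. The snippet below is a hypothetical reconstruction, not the study's actual code: the term list is abbreviated, and the note format is an invented placeholder.

```python
import re

# Hypothetical sketch of the keyword screen described in the study.
# The full misspelling list (twenty variants) and the real record
# format are not public; these are illustrative stand-ins.
SEARCH_TERMS = ["chatbot", "chatgpt", "chatboot", "chatgbt", "chatjpt"]

# One case-insensitive pattern matching any term as a whole word.
pattern = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in SEARCH_TERMS) + r")\b",
    re.IGNORECASE,
)

def flag_notes(notes):
    """Return the clinical notes that mention any search term."""
    return [note for note in notes if pattern.search(note["text"])]

# Invented example notes for illustration.
notes = [
    {"id": 1, "text": "Patient reports long sessions with ChatGBT at night."},
    {"id": 2, "text": "No mention of digital tools."},
    {"id": 3, "text": "Uses a chatbot to count calories."},
]
print([n["id"] for n in flag_notes(notes)])  # → [1, 3]
```

Matching whole words case-insensitively is what lets a screen like this catch "ChatGBT" in a doctor's note while ignoring unrelated text.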
This initial sweep surfaced 181 specific medical notes belonging to 126 unique patients. Olsen and Reinecke-Tellefsen then independently read through these notes to understand the context of the patients’ technology habits. They looked for indications that the digital interactions contributed to psychological symptoms.
When the two researchers disagreed on how to interpret a note, they discussed the case until they reached a consensus. Out of the initial group, they identified 38 patients who experienced potentially harmful consequences related to their use of the digital tools. The most common negative outcome was a worsening of delusions, which occurred in eleven of the cases.
“AI chatbots have an inherent tendency to validate the user’s beliefs,” Østergaard says in a press release. “It is obvious that this is highly problematic if a user already has a delusion or is in the process of developing one.”
Other patients experienced different types of harm. The team found six instances where the programs appeared to worsen suicidal thoughts or where patients asked the software about methods for self-injury. Another five patients used the bots to fixate on calorie counting, which worsened their existing eating disorders.
The researchers also found a handful of cases where the software appeared to aggravate other psychiatric conditions. A few patients experienced worsened episodes of mania, which are periods of extreme, sometimes reckless energy and elevated mood. Others used the bots compulsively in an attempt to relieve the intrusive thoughts associated with obsessive-compulsive disorder.
“Despite our knowledge in this area still being limited, I would argue that we now know enough to say that use of AI chatbots is risky if you have a severe mental illness,” Østergaard says. “I would urge caution here.”
The team also documented examples of patients using the technology in ways that seemed constructive or helpful. Of the 126 patients whose notes mentioned the technology, 32 used the tools to learn about their symptoms, seek informal talk therapy, or find companionship when they felt lonely. Another twenty patients used the programs to help organize practical tasks in their daily lives.
However, the researchers point out that technology companies did not design their products for medical or therapeutic use. It remains entirely unclear who is legally liable if a digital program gives a vulnerable patient dangerous advice. “I am fundamentally skeptical about replacing a trained psychotherapist with an AI chatbot,” Østergaard notes.
The authors acknowledge several limitations in their data collection and analysis. Most importantly, reading doctors’ notes does not definitively prove that the technology caused the patients’ symptoms. It is entirely possible the patients’ conditions would have worsened even without interacting with a computer program.
Additionally, doctors in this hospital system do not routinely ask every patient about their technology habits. The instances they found were simply the cases where a patient happened to mention the software during an appointment. “In our study, we are only seeing the tip of the iceberg, as we have only been able to identify cases that were described in the electronic health records,” Østergaard explains.
Because of this incomplete tracking, the exact rate at which these tools harm psychiatric patients remains unknown. The researchers also relied on a narrow list of search terms focused mostly on one specific brand, which means they likely missed patients who used other artificial intelligence platforms.
Other mental health professionals have echoed these concerns regarding how technology companies design their software. The business models of many digital platforms rely on maximizing the amount of time people spend looking at the screen. The software does not optimize for a user's long-term well-being, but for keeping that person as engaged as possible in the moment.
Despite his concerns, Østergaard is actually exploring ways machine learning can aid the medical field. His research group recently trained an algorithm to analyze electronic health records and predict which patients might develop specific psychiatric disorders in the future. The same underlying technology that worsens delusions in isolated users might eventually help doctors accelerate care in a clinical setting.
Moving forward, Østergaard advocates for a much broader approach to studying this emerging problem. He suggests conducting detailed qualitative interviews with affected patients to better understand their daily digital interactions. He also recommends setting up controlled experiments to see how different programming behaviors alter a patient’s psychiatric symptoms.
The researchers suggest that developers should build automatic safety features into their software. These features could detect indications of psychosis and redirect the conversation toward professional mental health resources. They argue that technology companies should not be the only ones deciding if their products are safe for public use.
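One minimal form such a safeguard could take is a screen that checks a user's message for crisis-related language and redirects to professional resources instead of continuing the conversation. The sketch below is purely illustrative: the phrase list and resource text are invented placeholders, and a real system would need clinically validated detection, not simple keyword matching.

```python
# Hypothetical sketch of the kind of safeguard the authors propose.
# The phrase list and resource text are invented placeholders; this is
# NOT a clinical screening tool, only an illustration of the mechanism.
CRISIS_PHRASES = ["hurt myself", "end my life", "they are watching me"]

RESOURCE_MESSAGE = (
    "It sounds like you may be going through something serious. "
    "Please consider contacting a mental health professional or a "
    "local crisis line."
)

def safety_check(user_message):
    """Return a redirect message if the text contains a crisis phrase,
    or None so the normal conversation can continue."""
    lowered = user_message.lower()
    if any(phrase in lowered for phrase in CRISIS_PHRASES):
        return RESOURCE_MESSAGE
    return None

print(safety_check("I think they are watching me through my phone."))
```

A production system would sit between the user and the model, so that flagged messages never reach the agreeable response-generation step at all.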
“Currently, it is left to the companies themselves to decide whether their products are safe enough for users,” Østergaard says. “Regulation is needed at a central level.” In the meantime, the researchers hope mental health professionals will actively ask their patients about their digital habits and offer guidance on safe usage.
The study, “Potentially Harmful Consequences of Artificial Intelligence (AI) Chatbot Use Among Patients With Mental Illness: Early Data From a Large Psychiatric Service System,” was authored by Sidse Godske Olsen, Christian Jon Reinecke-Tellefsen, and Søren Dinesen Østergaard.