A recent study published in PNAS Nexus suggests that reading history summaries generated by artificial intelligence can subtly shift people’s social and political opinions. The research indicates that popular chatbots carry hidden biases that can influence users, even when the software provides factually accurate information in response to neutral questions. These findings provide evidence that relying on AI to learn about the world might quietly shape public attitudes.
Generative AI refers to computer programs that can create new text, images, or audio based on patterns they learned from vast amounts of data. Chatbots like ChatGPT are a common type of this technology. They are designed to mimic human conversation and answer questions. People increasingly use these tools as everyday search engines to learn about historical events and gather facts.
The scientists wanted to know if the way these chatbots write about history could sway how people think about modern issues. Previous studies focused on how artificial intelligence persuades people when it is specifically instructed to make an argument or spread misinformation. This new research examines a more subtle form of influence.
“My collaborators and I had been following a lot of interesting research on AI-powered chatbots’ ability to persuade people during dynamic conversations, and we started wondering how AI-generated content could influence people in more routine, everyday settings,” said study author Daniel Karell, an assistant professor of sociology at Yale University.
“Namely, what happens when people simply develop the habit of querying a chatbot to learn things about the world? Once we had this question, we decided to focus on the case of using AI to learn about historical events since research has shown that people’s understanding of history profoundly influences their identities and worldviews.”
The scientists explored a concept called latent bias. This is an underlying slant that naturally develops in a computer program during its training process. It often occurs because the software absorbs the subtle opinions and language patterns present in the millions of internet pages it reads.
The researchers also tested prompting bias. This happens when a human user explicitly instructs the chatbot to adopt a specific political viewpoint. For example, a user might type a command asking the software to write a summary from a strictly conservative or liberal perspective.
To test these concepts, the researchers conducted an experiment with 1,912 participants. They selected a group of people that closely matched the demographic makeup of the United States population. Each participant was asked to read short summaries of two real historical events from the twentieth century.
One event was the Seattle General Strike of 1919, in which tens of thousands of workers walked off the job for several days. The second event involved university student protests in 1968 that demanded more academic representation for ethnic minorities. The scientists selected these specific events because they are not widely known to the average person today.
If participants already possessed strong, fixed opinions about an event, a short text would be less likely to change their minds. For example, the researchers suspected that well-known events like the September 11 attacks would not yield the same attitude shifts. Using lesser-known events allowed the researchers to isolate the persuasive effect of the summaries themselves.
Participants were randomly divided into different groups to read different versions of the history summaries. One group read the standard Wikipedia entries for the events. A second group read summaries generated by the chatbot GPT-4o using a basic, neutral request.
Two other groups read summaries generated by GPT-4o after the chatbot was instructed to write with either a liberal or a conservative slant. All of the AI-generated summaries remained factually accurate; the chatbot changed only the framing, tone, and emphasis of the historical facts.
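To make the three AI conditions concrete, the sketch below shows one way such summaries could be requested from GPT-4o through the OpenAI Python client. The prompts are hypothetical stand-ins; the study's exact wording and generation settings are not reproduced here.

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

# Hypothetical prompts -- the study's actual instructions are not published here.
PROMPTS = {
    "neutral": "Write a short, factually accurate summary of the Seattle General Strike of 1919.",
    "liberal": (
        "Write a short, factually accurate summary of the Seattle General Strike of 1919, "
        "framed from a politically liberal perspective."
    ),
    "conservative": (
        "Write a short, factually accurate summary of the Seattle General Strike of 1919, "
        "framed from a politically conservative perspective."
    ),
}

def generate_summary(prompt: str) -> str:
    """Request a single summary from GPT-4o for a given prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Produce one summary per experimental condition.
summaries = {condition: generate_summary(p) for condition, p in PROMPTS.items()}
```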
After reading the texts, participants answered survey questions about their own social and political views related to the events. They were asked about their opinions on the appropriateness of labor strikes and the use of school curricula to advance social justice causes. Their answers were scored on a five-point scale.
On this specific scale, a score of one meant extremely conservative, while a score of five meant extremely liberal. A score of three represented a perfectly moderate viewpoint. The researchers then averaged the scores to determine the overall political leanings of the readers.
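As a rough illustration of how individual responses on this scale translate into group-level comparisons, the snippet below averages made-up scores on the same one-to-five range. The numbers are invented for illustration and are not the study's data.

```python
from statistics import mean

# Invented responses on the 1 (extremely conservative) to 5 (extremely liberal) scale.
wikipedia_readers = [3, 4, 3, 3, 4, 4, 3, 4]
default_ai_readers = [4, 3, 4, 4, 4, 3, 4, 4]

print(f"Wikipedia group mean:  {mean(wikipedia_readers):.2f}")
print(f"Default AI group mean: {mean(default_ai_readers):.2f}")
```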
The scientists found that reading the different summaries had a measurable impact on participant attitudes. The default chatbot summaries led to more liberal opinions compared to the standard Wikipedia summaries. This suggests that the baseline model carries an underlying liberal bias that naturally surfaces even when a user asks a simple question.
The average score after reading the Wikipedia texts was 3.47, close to the moderate midpoint. After reading the default chatbot summaries, the average rose slightly to 3.57, a small shift from a moderate stance toward a somewhat liberal one.
When the chatbot was specifically instructed to use a liberal framing, readers also shifted toward more liberal opinions. This shift happened across the board for all demographics. It impacted readers regardless of whether they originally identified as liberal, moderate, or conservative.
“I was a little surprised that we did actually find an effect on people’s attitudes,” Karell told PsyPost. “I thought there was a good chance that we were going to get null results because I did not think that simply reading a short AI-generated summary of an event would have much of an impact (relative to reading a Wikipedia summary of the event).”
“It’s likely that we did find an effect because we used historical events that were not very familiar to people. If we had selected well-known events, like, say, the September 11, 2001 attacks or the January 6 riot at the US Capitol, we probably would have obtained null results because many people already have well-formed, strongly-held attitudes about these events.”
The summaries generated with a conservative framing tended to make readers report more conservative opinions overall. But this shift primarily occurred among participants who already leaned conservative, rather than changing the minds of liberal or moderate readers.
This uneven reaction might be tied to how people process new information. People tend to filter new facts through their existing belief systems. Encountering information that confirms existing beliefs often reinforces those views, while encountering opposing views can provoke defensive reactions.
“People’s views on a historical event can be influenced by learning about this event from a popular commercial AI chatbot, even when a user prompts the chatbot with a basic, neutral query,” Karell said. “Furthermore, their attitudes can also be influenced by the chatbot when it has been instructed to take on a particular political bias, which is easy to do.”
“Overall, users of AI should be aware that the companies that develop AI tools — and the governments that might impose regulations on them — can imbue the chatbot with characteristics that subsequently influence users, even when they are using the AI tool in mundane, everyday ways.”
While the study provides evidence of persuasion, it is important to avoid overstating the scope of the findings. The experiment only tested summaries of two historical events. More research is needed to see if these patterns hold true for other historical periods and other subjects beyond history.
“It is important to keep in mind that the effect sizes are modest,” Karell explained. “The differences between the groups that read the AI and Wikipedia summaries was between, say, a moderate attitude and a ‘slightly liberal’ attitude. Nonetheless, it could be that the effect sizes accumulate and ultimately become more consequential over many uses of a chatbot, but further research will be needed to determine this.”
The degree of underlying bias likely varies across models created by different companies. The way a model is trained and filtered before it reaches the public can change how it talks about history. Future studies might explore exactly how a model's underlying slant interacts with specific user instructions.
“In the near future, many people will learn about history, as well as other things about our world, by simply asking a chatbot to tell them about it,” Karell said. “This puts knowledge and learning in the hands of the developers and regulators of AI tools, which I think will have significant implications for society. Understanding these implications is a big task that will require many research projects — and the continuing work of the excellent social scientists currently studying the social consequences of AI.”
“After we conducted the study (but before publication), xAI announced that it was going to use its AI chatbot, Grok, to create its own version of Wikipedia, called ‘Grokipedia,’” Karell added. “So, widespread, publicly available summaries of history (and other knowledge) generated by a private company’s AI tool are now a reality.”
The study, "How latent and prompting biases in AI-generated historical narratives influence opinions," was authored by Matthew Shu, Daniel Karell, Keitaro Okura, and Thomas R. Davidson.