A recent study published in the Journal of Experimental Psychology: General suggests that people consistently judge creative writing more harshly if they believe it was created by artificial intelligence. This bias appears remarkably difficult to overcome, pointing to a persistent human preference for art created by people.
Generative artificial intelligence refers to computer programs capable of producing new text, images, or music by predicting patterns from massive amounts of data. Tools like ChatGPT and Claude can now write essays, poems, and stories that read very much like they were written by a real person. As these technologies become more common, scientists wanted to understand how people react to computer-generated art.
“We started this project in early 2023, shortly after the launch of ChatGPT. From my early interactions with the technology, it was clear to me that this tool was capable of creative production, and I was very curious about whether and how humans would react to AI-produced creative goods,” explained study author Manav Raj, an assistant professor in management at the Wharton School of the University of Pennsylvania.
Prior research hints that people might not be able to tell the difference between human and computer writing if they are kept in the dark. However, the researchers conducted this specific study to see what happens when audiences are explicitly told that a machine wrote the text. They wanted to see if this knowledge changes how people enjoy the art and whether anything can soften that negative reaction.
To explore these questions, the scientists carried out sixteen separate experiments involving a total of 27,491 participants. In the first group of five experiments, researchers tested whether the actual content of the writing changed how people reacted to the artificial intelligence label. They had participants read poems and short stories generated by ChatGPT and rate them on quality, creativity, and enjoyment.
Some participants were told a machine wrote the text, while others were told a human wrote it. The researchers varied the writing style, testing first-person versus third-person perspectives, poetry versus prose, and different emotional tones. They even tested stories featuring human characters versus aliens, animals, and robots.
Across all these variations and thousands of participants, readers consistently gave lower ratings to the text when they thought a machine wrote it. Changing the story details did not consistently lessen this penalty. This initial phase provided evidence that the bias is largely independent of the specific content of the writing.
In the second phase of the research, the scientists conducted an experiment with 3,590 participants to see if the evaluation context mattered. They asked one group to judge the text as a piece of art. They asked another group to judge it based on objective qualities like coherence and logic.
Changing the instructions in this way did not soften the negative reaction. Participants in both groups still devalued the writing when they believed it came from a computer. This suggests that the bias applies whether people are reading for pleasure or for practical evaluation.
Next, the researchers ran five more experiments to see if changing people’s perceptions of the computer program would help. In these studies, they asked participants to read articles about the impressive cognitive or emotional capabilities of machines before reading the generated stories. In some versions, the scientists also tried humanizing the software by giving it a name and a gender.
None of these strategies reliably reduced the negative bias. Even when the computer program was described as highly capable or given human traits, participants still rated the writing lower upon learning its origin. The negative reaction proved remarkably persistent across these diverse approaches.
“The surprise to us was how persistent the effect was,” Raj told PsyPost. “We really tried at different points to ‘break’ it and to find circumstances where we could get the AI disclosure discount to go away. Despite our attempts that built on existing literature on algorithmic aversion, we found this result was really sticky.”
In a fourth pair of experiments, the scientists explored whether knowing a computer wrote a story simply makes people feel ambivalent. Ambivalence means having mixed feelings, where someone might see both positive and negative qualities in the exact same thing at the exact same time. Testing 423 and 1,280 participants respectively across two studies, the researchers sought to measure this specific emotional state.
They found that knowing about the computer involvement did not create mixed feelings. It simply made the participants’ judgments more negative overall. The disclosure did not create a complex emotional response, but rather a straightforward decrease in appreciation.
Finally, the researchers ran three experiments to test a concept involving a human in the loop. They wanted to know if framing the writing process as a collaboration between a person and a machine would be viewed more favorably. They tested this with machine-generated stories and with actual award-winning short stories written by humans.
When participants were told a person used a computer program as a tool to write the story, they still judged the work just as harshly as if the machine had written it alone.
Throughout the studies, the researchers collected data on various potential mechanisms, such as perceived humanness, effort, and emotional depth. They consistently found that perceived authenticity was the strongest factor explaining the lowered ratings: people simply view machine-generated text as less authentic than human creations.
“Our main finding is that, at least at this point, humans have a persistent, negative reaction to knowing that creative goods (or at least creative writing) are produced with the help of AI,” Raj said. “While everything with AI is a moving target right now, this lasted over many, many studies and a roughly two-year period of data collection.”
While these findings provide evidence of a strong bias, there are a few potential limitations to keep in mind. The participants were recruited from an online platform that tends to attract people who are somewhat tech-savvy. This means the results might not perfectly represent the entire global population.
The observed biases could also manifest differently in visual arts, music, or other physical products. It is entirely possible that attitudes will shift as society becomes more accustomed to this technology. Future research could explore whether this negative bias fades over time as machine-generated text becomes an everyday reality.
“One thing I’d note is that our study does not speak to the quality of AI-generated creative goods at all,” Raj explained. “In all cases, we held the writing sample constant and just manipulated whether participants believed it was written by AI. Accordingly, the quality and nature of the creative goods are an open question.”
“This last point is a question that I’d be interested in studying in the future. While we are using AI for creative purposes and innovation, we do not yet know what it means for the characteristics of creative goods (other than some research that suggests we have a hard time telling apart AI-generated vs. human-generated creative goods in some settings). I’m very interested in pushing further in this domain.”
The study, “The Artificial Intelligence Disclosure Penalty: Humans Persistently Devalue AI-Generated Creative Writing,” was authored by Manav Raj, Justin M. Berg, and Rob Seamans.