Despite the rapid rise of generative AI in artistic and professional domains, new research suggests that artificial intelligence has not surpassed humans in creative thinking. After analyzing 17 experimental studies comparing human- and AI-generated ideas, researchers found no compelling evidence that AI consistently outperforms humans in generating original or useful ideas. While some individual studies suggested advantages for generative models, a broader statistical analysis indicated that these results may not hold up under closer scrutiny.
Generative AI refers to computer systems designed to produce content—such as text, images, music, or ideas—based on prompts provided by human users. Tools like ChatGPT rely on large language models trained on vast amounts of text data to simulate human-like responses. In recent years, these systems have been widely used to support creative tasks, from brainstorming product ideas to generating marketing slogans. Their speed and fluency have led many to believe that AI might soon rival—or even exceed—human creativity.
“What initially piqued our curiosity was the general debate within our fields about whether generative AI has now reached a level that outperforms human creativity on various parameters, including in idea generation,” explained study authors Alwin de Rooij (assistant professor at Tilburg University and associate professor at Avans University of Applied Sciences) and Michael Mose Biskjaer (associate professor at Aarhus University).
“Concretely, we noticed some striking claims from a few early studies suggesting that AI had indeed surpassed human creativity. The contention, in those studies, is that when ideas generated by humans are compared to those generated by prompting GenAI for creative ideas, ideas by GenAI are more likely to be judged as creative. Although these findings attracted wide media attention, the evidence from this first wave of studies after ChatGPT’s release also seemed inconsistent if you consider the available studies. We felt a systematic evaluation was needed to explore whether AI had really surpassed humans in creative idea generation, given the impact of this debate. Hence, we conducted a meta-analysis.”
The researchers conducted a meta-analysis, a statistical method used to combine findings from multiple independent studies. Their focus was on a specific aspect of creativity: idea generation. This refers to the ability to produce multiple original and relevant solutions to a given problem. It is often assessed using tasks like the Alternative Uses Task, where participants are asked to think of unconventional uses for everyday objects like bricks or paperclips. Such tasks have long served as a proxy for measuring creative potential in psychological studies.
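For readers curious about the mechanics, the pooling step of a meta-analysis can be sketched in a few lines of Python. The snippet below is a minimal illustration of the widely used DerSimonian-Laird random-effects model; the effect sizes, their variances, and the helper function `random_effects_pool` are hypothetical stand-ins for illustration only and are not taken from the study.

```python
import numpy as np

def random_effects_pool(effects, variances):
    """Pool standardized effect sizes with a DerSimonian-Laird random-effects model."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances                        # inverse-variance (fixed-effect) weights
    fixed = np.sum(w * effects) / np.sum(w)    # fixed-effect pooled estimate
    q = np.sum(w * (effects - fixed) ** 2)     # Cochran's Q: between-study heterogeneity
    df = len(effects) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)              # estimated between-study variance
    w_star = 1.0 / (variances + tau2)          # random-effects weights
    pooled = np.sum(w_star * effects) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)  # estimate, 95% CI

# Hypothetical effect sizes (Hedges' g, AI minus human) and their variances;
# these numbers are made up for illustration, not drawn from the paper.
g = [0.45, -0.10, 0.05, 0.30, -0.20, 0.12]
v = [0.04, 0.05, 0.03, 0.06, 0.04, 0.05]
pooled, (lo, hi) = random_effects_pool(g, v)
print(f"pooled g = {pooled:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
# If the confidence interval spans zero, the pooled AI-vs-human
# difference is not statistically significant.
```

With these made-up inputs, the pooled effect is small and its confidence interval straddles zero, which is the same qualitative pattern the meta-analysis reports for the real data.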
The analysis included 115 effect sizes drawn from 17 experiments published between January 2022 and January 2025. In each study, researchers compared the quality of ideas generated by humans with those produced by prompting various generative AI tools. Some studies used professional participants, such as designers or domain experts, while others relied on students or crowd-sourced workers. Similarly, the AI tools tested ranged from early models like GPT-3 to more recent systems like GPT-4 and Claude.
Ideas were evaluated using three main criteria: originality, usefulness, and overall creativity. Importantly, judges were not told whether a given idea came from a human or a machine. This blind review process was designed to reduce potential bias and isolate the perceived creativity of the ideas themselves.
When the researchers combined the data from all 17 studies, they found a small average effect suggesting AI-generated ideas might be rated slightly more creative than those produced by humans. However, the difference was not statistically significant.
In other words, the apparent advantage for AI could be due to random variation rather than a meaningful pattern. Even when the researchers looked specifically at originality or usefulness, they found no consistent trend favoring one over the other. Any observed differences were largely driven by a few studies reporting especially large effects.
“Previous studies were split,” de Rooij and Biskjaer told PsyPost. “Some studies concluded that AI outperformed humans, while others said the opposite. There were also many studies with null findings. By pooling the evidence, we found a more balanced picture: AI can, of course, generate ideas that people evaluate as creative, but it has not surpassed humans in any structural way. Our results caution against overinterpreting any single study.”
“In short, current evidence (based on our sample) does not support any claims suggesting that generative AI has now structurally surpassed humans in creative idea generation.”
“This is not to say that AI cannot enable creativity,” the researchers explained. “Far from it. The creative potential of AI is undeniable. As is the case with all other tools, not least digital creativity support tools (CSTs) developed and studied in Human-Computer Interaction (HCI), this creative potential depends on the practice of the people using these tools. But prompting AI for creative ideas may not be the most efficient or fruitful strategy.”
“See, for instance, this paper that shows a range of ways in which AI can be used to enhance human creativity: https://research.tilburguniversity.edu/en/publications/how-artists-use-ai-as-a-responsive-material-for-art-creation.”
The researchers also examined whether certain AI models were more successful than others. Among the most frequently studied systems were GPT-3, GPT-3.5, and GPT-4, successive generations of the OpenAI language models behind tools like ChatGPT. When analyzed separately, none of these models showed a statistically significant advantage over human participants. Even GPT-4, the most advanced model in the dataset, did not reliably outperform humans across tasks.
While these results may seem surprising given the growing capabilities of generative AI, the researchers caution against drawing overly broad conclusions. AI systems are trained on enormous datasets, often containing examples of the very tasks they are later asked to perform. As a result, they may be especially well-suited to standardized tasks that resemble their training data. But this can give the illusion of creativity without the depth or intentionality associated with human thought.
In contrast, human creativity is shaped by lived experience, personal values, emotional insight, and cultural context. When people generate ideas, they often draw on memories, social cues, and ethical considerations—elements that current AI systems do not truly possess. While AI may be able to simulate these processes through pattern recognition, it does not yet demonstrate the kind of situated, meaningful creativity that many scholars argue is fundamental to human innovation.
The researchers also highlight a broader theoretical concern. If AI models are simply producing the most statistically likely “creative” responses based on patterns in their training data, does that count as genuine creativity? Or is it more akin to a sophisticated form of imitation? This question remains unresolved, but it points to the importance of defining what we mean by creativity in the first place.
“One surprise was the amount of variation across studies,” de Rooij and Biskjaer said. “In some cases, generative AI initially seemed to generate significantly more original output. But those effects disappeared when we adjusted for influential values (sensitivity analysis). One explanation for our ‘no significant differences’ findings may be that, given the vast dataset that generative AI is trained on, when you then prompt it for a creative idea, it will generate a ‘most likely creative idea,’ i.e., ideas derived from what people have frequently marked as creative in the training dataset.
“Whenever something is frequently called creative by others, it may not be so creative after all. This may fundamentally limit the actual creative potential of generative AI models. Generally, the study confirmed our early intuitions that the ‘AI is more creative than humans’ narrative tends to be driven by a few striking findings rather than a consistent pattern.”
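The “sensitivity analysis” the authors mention can be pictured as a leave-one-out check: re-pool the evidence with each study removed in turn and see whether the overall conclusion flips. Continuing the illustrative sketch above, and reusing its hypothetical data and `random_effects_pool` helper, a minimal version of that check might look like this; it shows the general technique, not the authors' exact procedure.

```python
# Continues the earlier sketch: leave-one-out sensitivity analysis.
def excludes_zero(lo_, hi_):
    return lo_ > 0 or hi_ < 0  # CI excluding zero ~ statistically significant

base = excludes_zero(lo, hi)   # significance of the full pooled estimate
for i in range(len(g)):
    g_sub = g[:i] + g[i + 1:]  # drop study i
    v_sub = v[:i] + v[i + 1:]
    p_i, (lo_i, hi_i) = random_effects_pool(g_sub, v_sub)
    note = "  <- influential" if excludes_zero(lo_i, hi_i) != base else ""
    print(f"without study {i + 1}: g = {p_i:.3f}, CI [{lo_i:.3f}, {hi_i:.3f}]{note}")
```

A study is flagged as influential when dropping it changes whether the confidence interval excludes zero, which is exactly how a few striking results can drive, or undo, an apparent AI advantage in the pooled estimate.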
The study does have some limitations. Most of the included research focused on text-based language models, which may not capture the full range of generative AI tools, such as image or music generators. Also, many studies used narrow tasks that may not reflect the complexity of real-world creative challenges. For instance, few asked participants to co-create with AI or explore ideas over time—processes that may better capture the potential of AI-human collaboration.
“The meta-analysis is early and, with 17 studies, relatively small,” the researchers noted. “As such, it's intended to provide a quick snapshot of the current state of generative AI, capturing works between 2022 and 2025. Who knows what future AI models will be capable of!”
Looking forward, the researchers suggest that future studies should explore how generative AI can be integrated into creative workflows rather than pitted against humans in one-off comparisons. Instead of asking whether AI is better or worse at creativity, it may be more productive to examine how it changes the way people think, work, and collaborate. In professional and artistic contexts, for example, generative tools might serve as partners or provocateurs, helping to spark new ideas or reframe existing problems.
“While AI can produce output that people judge as creative, human creativity is bound to context in different ways than AI is,” de Rooij and Biskjaer explained. “Human creativity is embedded in culture and lived experience, and it is enacted through human bodies of flesh and blood. Rather than asking whether AI has surpassed human creativity and perhaps worrying about the consequences, our studies suggest that such concerns might be unfounded. Rather, what we really need is more insight into how we can best use generative AI to enrich human creativity. And to this end, much more interdisciplinary research is needed.”
The study, “Has AI Surpassed Humans in Creative Idea Generation? A Meta-Analysis,” was authored by Alwin de Rooij and Michael Mose Biskjaer.