Given permission to use AI, most college students show surprising restraint in their final essays

When given permission and guidance to use artificial intelligence tools in college writing classes, students largely rely on the software for brainstorming and research rather than having it write essays for them wholesale. These findings, published in the Journal of Writing Research, suggest that students employ computerized text generators selectively to augment their learning process. The study also revealed unexpected differences in how non-native English speakers use the technology compared to their peers.

The public release of ChatGPT in late 2022 generated intense debate regarding its place in higher education. Many educators worried that these tools would erode original thinking and academic integrity. Other instructors argued that generative artificial intelligence could act as a personalized tutor, helping to outline ideas and support language learners. According to the researchers, these debates often frame the technology as a simple binary, assuming a paper is either entirely human-written or entirely machine-written.

Because of this divide, instructors needed granular data about how students actually interact with these programs behind the scenes. Sarah Madsen Hardy, a writing program instructor at Boston University, led a research team to investigate the issue. She and her colleagues wanted to observe how students prompt artificial intelligence systems and what types of machine-generated text they ultimately choose to include in their final academic papers. They sought to create an environment where students felt comfortable exploring these new technologies without fear of punishment.

The researchers designed an observational study involving students enrolled in introductory writing and research courses at Boston University. The course sections were specifically set up as pilots to experiment with artificial intelligence integration. All students in these pilot sections received subscriptions to ChatGPT Plus. Instructors guided the students through exercises focused on basic technical understanding and the ethical implications of the software, such as identifying biases and verifying digital sources.

Instead of banning the tools, instructors allowed students to submit work containing up to 50 percent machine-generated text. To track how the software was used, instructors required students to highlight any word-for-word machine-authored text in a blue font on their submitted assignments. Students were not asked to highlight minor grammatical adjustments, only substantive language generated directly by the chat interface.

After the semester concluded, Madsen Hardy and her team recruited 50 of these students to share their essays. A subset of 34 participants also provided the chat logs detailing their interactions with ChatGPT during the drafting process. About a quarter of the study participants identified themselves as studying English as a foreign language. The research team specifically wanted to record whether these international students utilized the text generator differently than native English speakers.

The team stripped the collected essays and chat logs of identifying information and performed a thematic content analysis. This process involved sorting the students’ text into specific functional categories, such as brainstorming, revision, or direct writing. To increase efficiency, the researchers utilized a large language model as an additional rater alongside human experts. They measured the agreement between the human raters and the computerized system, finding a high level of consistency in how the chat logs were categorized.
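The article does not say which agreement statistic the team used. As a minimal sketch, Cohen's kappa is one common way to measure agreement between two raters who assign categorical labels to the same items. The function name, the category labels, and the toy ratings below are hypothetical illustrations, not the study's data or method.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items.

    Kappa compares the observed agreement rate with the agreement
    expected by chance, given each rater's label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Share of items on which the two raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: sum over labels of the product of marginal rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six prompts (illustration only, not study data).
human = ["brainstorming", "revision", "revision",
         "direct writing", "brainstorming", "revision"]
model = ["brainstorming", "revision", "direct writing",
         "direct writing", "brainstorming", "revision"]

kappa = cohens_kappa(human, model)
print(round(kappa, 2))  # 0.75 for this toy data
```

A value near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance.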

In analyzing those chat logs, the research team found that a minority of the students’ requests involved asking the program to generate original text for their assignments. Only 18.6 percent of the 290 analyzed prompts asked ChatGPT to write words from scratch. The vast majority of the students’ interactions involved background work leading up to the writing stage.

Students most frequently used the chat function to ask for help with revision, such as making sentences shorter or altering the tone. This accounted for about a quarter of the total prompts. Another highly common use was asking the program to explain course materials, define concepts, or clarify academic readings. When researchers grouped the prompts, they noticed that students asked the software to give them advice, resources, or explanations far more often than they asked it to produce text.

The chat logs also revealed a timeline of how students engaged with the tool as their assignments progressed. Most students began their interactions by asking the artificial intelligence for help with planning and locating sources. Prompts asking the machine to produce and compose writing usually occurred in the final quarter of the chat session. This indicates that direct text generation only happened after a long conversation tackling traditional phases of the drafting process.
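One simple way to place a prompt on such a timeline is to divide its position in the session by the session length and bucket the result into quarters. The sketch below is illustrative only: the article does not describe how the researchers computed the timeline, and the function name, labels, and example session are hypothetical.

```python
def session_quarter(index, session_length):
    """Return the quarter (1 to 4) of a session that a prompt falls in.

    `index` is the prompt's 0-based position within the session.
    """
    if not 0 <= index < session_length:
        raise ValueError("index must be within the session")
    return (4 * index) // session_length + 1


# Hypothetical eight-prompt session (illustration only, not study data).
session = ["planning", "sources", "clarification", "planning",
           "revision", "revision", "revision", "direct writing"]

placements = [(label, session_quarter(i, len(session)))
              for i, label in enumerate(session)]
for label, quarter in placements:
    print(f"quarter {quarter}: {label}")
```

In this made-up session, the planning prompts land in the first half and the single request for direct writing lands in the fourth quarter, which mirrors the pattern the article describes.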

When the team looked at the actual submitted papers, the data showed high levels of restraint among the writers. More than half of the students who participated in the pilot program chose not to include any verbatim machine-generated text in their final drafts. Across all 50 analyzed papers, only 8.2 percent of the total submitted words were flagged in blue to indicate artificial intelligence authorship. This usage fell well below the generous 50 percent allowance permitted by the instructors.

When students did choose to paste text directly from ChatGPT into their papers, they rarely dropped in entire block paragraphs. Only about six percent of the blue text consisted of wholesale paragraph chunks. Instead, students mostly wove small, machine-generated phrases into their own original writing. The most common rhetorical purpose for incorporating this generated text was to help with discussion, analysis, and synthesis of ideas.

The data revealed striking contrasts between students studying English as a foreign language and native English speakers in the class. The authors expected that non-native speakers might heavily rely on the text generator to clarify confusing academic readings. The analysis showed the exact opposite pattern. Prompts seeking understanding or clarification were the least common type of request submitted by the foreign language students.

Native English speakers asked the text generator to clarify concepts at a rate seven times higher than their international peers. The foreign language students, by contrast, were roughly twice as likely to use the chat interface to ask for help with revising their existing prose. They were also more likely to begin a chat session by asking for direct feedback on their writing.

These interaction differences carried over into the final essays. The foreign language students actually integrated fewer machine-generated passages into their final submissions than the native speakers did. Additionally, native speakers frequently used the program to generate summaries of academic sources to include in their papers. The international students almost never used the tool for summarization, choosing instead to handle those tasks without direct machine assistance.

The researchers noted that these patterns challenge assumptions that non-native speakers heavily depend on writing software to complete assignments. Because taking classes in a second language imposes an extra cognitive load, these writers used the tools mainly to polish their sentence structures. The study suggests that international students use artificial intelligence resources strategically to meet specific linguistic needs, rather than leaning on them for basic comprehension.

The authors emphasized that giving students a structured environment to explore the tool helped them make active decisions. By treating the software as a collaborative assistant, students retained ownership over their assignments. “Our findings show that students who are offered sustained instruction are capable of engaging GAI robustly and adopting its outputs selectively, in ways that suggest both judgement about its effects and investment in their own learning,” the authors wrote.

The researchers outlined a few notes of caution regarding their interpretation of the data. Because participation in the post-semester data collection was optional, the 50 students who volunteered their assignments might not represent the typical college freshman. Students who struggled in the course or relied heavily on automated writing for illicit reasons might have decided not to share their data with the researchers.

The requirement for students to self-report their generated text using blue font presents another limitation. Writers might have been inconsistent with their highlighting, or they may have forgotten to label sentences inspired by the chat interface. Still, instructors held regular meetings with students to review drafts in progress, giving teachers confidence that the reported data reflected reality.

Future research will need to include larger participant groups to confirm whether the observed language usage patterns hold up on a wider scale. Researchers could also pair this content analysis with student interviews. Speaking with writers directly might illuminate exactly why non-native speakers hesitate to use chatbots for reading comprehension.

Until then, the current data offers an optimistic snapshot of modern college classrooms. When instructors model safe and effective ways to interact with text generators, students appear to integrate the technology as a helpful assistant rather than a replacement for their own academic effort.

The study, “Generative AI Use in College Writing Classes: An Analysis of Student Chat Logs and Writing Projects,” was authored by Sarah Madsen Hardy, Pary Fassihi, Shuang Geng, Christopher McVey, and Matt Parfitt.
