How Helpful Are Large Language Models to Professional Writers? A New Study Explains

NYU Center for Data Science
Apr 23, 2024

Large language models are becoming increasingly adept at generating passably functional human-like text. But are they useful for more creative tasks? A new study co-led by CDS PhD student Vishakh Padmakumar records the activities of actual creative writers to explore how AI tools might be able to support the creative writing process.

The study, titled “Creativity Support in the Age of Large Language Models: An Empirical Study Involving Emerging Writers,” was recently accepted to Creativity & Cognition 2024. Padmakumar collaborated with Tuhin Chakrabarty from Columbia University, Faeze Brahman from the Allen Institute for Artificial Intelligence, and Smaranda Muresan, also from Columbia University.

To gain insights into how professional writers interact with large language models, the researchers recruited 17 participants from MFA creative writing programs across the United States. “We reached out to MFA creative writing students via mailing lists of universities,” Padmakumar said. Some of them are already published, Padmakumar added, and at least one has been nominated for the Pushcart Prize, a prestigious small-press literary award.

The study’s design was grounded in the cognitive process model of writing, a theory from the 1980s by Linda Flower and John R. Hayes that views writing as a goal-oriented thinking process encompassing three non-linear cognitive activities: planning, translating, and reviewing. Here “planning” refers to “setting goals, brainstorming ideas, and organizing the writers’ thoughts.” “Translation” is “the process of verbalizing ideas and thoughts.” “Reviewing” refers to “evaluating and revising what has been written by receiving feedback.”

Participants were provided with a collaborative writing interface that allowed them to draft their stories, generate plot ideas, and chat with the AI model directly. The interface was inspired by the theory and incorporated suggestions from the writers themselves, and it let writers obtain model help in all three cognitive activities of the writing process. The model used was GPT-3.5, which at the time of writing powers the free version of ChatGPT.

Padmakumar and his team found that while writers sought the AI’s help across all three types of cognitive activities, the writers found the model most helpful in translation and reviewing tasks. “People find the model pretty helpful for low-level translation, which means that if I tell the model ‘rewrite this paragraph, from the point of view of the bartender, instead of the main character,’ models can execute this local edit well because they have the context of the story,” Padmakumar said.

Additionally, the writers appreciated the model’s ability to provide instant feedback on their drafts, since seeking feedback from human editors or friends can be expensive and time-consuming.

However, the study also highlighted some weaknesses in the AI’s capabilities, particularly in high-level planning and figurative language. “When asked for plot ideas, they found models often resort to clichés — they try to write the ‘happily ever after’ type of story,” Padmakumar noted. “And similarly, they aren’t very good right now at using figurative language. As one of our participants noted, they reverse the maxim of ‘show, don’t tell’, often expressing things literally that are better conveyed through subtlety and nuance.”

Tracking the writers’ engagement in the three writing activities gave insight into how writers’ actions evolve throughout the process of creation. The researchers found that writers engaged in planning and seeking feedback more in the beginning, and in translation more as a project neared completion. The chronological record of one writer’s activities during the creation of the story “Inheritance of Shadows” was typical of this pattern.

Incidentally, this activity-tracking appeared to affirm Flower and Hayes’ postulation that writers work nonlinearly, moving fluidly between these creative acts.

The study’s findings, along with the data collected from the writer-AI interactions, are now publicly available on the project’s website. The site includes all the stories created using the LLM-assisted process and an interactive feature that shows exactly how the LLM was used. This open-access approach allows other researchers to build upon the resource and formulate new hypotheses on how AI models can better support creative writing.

Looking ahead, Padmakumar sees the potential for these AI tools to enhance the creative process, rather than replace human writers. “I don’t actually think the way we train the models right now will ever replace the best writers,” he said, “but I think it could be a useful tool for speeding up some of their processes.”

By Stephen Thomas
