A newly published report from the University of Pittsburgh claiming that “AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably” has sent a swarm of headlines buzzing around the maligned and beloved art form. The Washington Post definitively declared, “ChatGPT is a poet” atop its article about the report. Others more closely echoed the age-old claims that poetry is dead, or as one publication offered, “Suck it, Shakespeare.”
But reports like these are important to investigate as they have practical and potentially serious consequences. They can reinforce what some argue is the art form’s irrelevance, which in turn helps fuel arguments that discredit the importance of teaching poetry and supporting today’s thousands of passionate and prolific poets, as well as the hundreds of nonprofit poetry organizations and publishers dedicated to their work.
Poets use well-worn human material (agreed-upon symbols for sounds and ideas, a.k.a. language) to make something new. How different is that from generative large language models (LLMs) algorithmically generating new texts from the millions of pages of writing (or inputs) with which computer programmers have built them? Very.
It’s worth remembering that LLMs’ origin story involves consuming vast quantities of authors’ work without seeking consent or offering remuneration. As web developer Alex Reisner explained in an Atlantic article, “A culture of piracy has existed since the early days of the internet, and in a sense, AI developers are doing something that’s come to seem natural. It is uncomfortably apt that today’s flagship technology is powered by mass theft.”
In this case, the authors of the new report, scientists Brian Porter and Edouard Machery, entered their experiments with the premise that a poem is something to be solved and poetry, a competition to win. And, from their perspective, they earned a blue ribbon. “AI generated poems are now more human than human,” they confidently pronounced.
In racing to confer godlike status on machine output, what’s lost is what makes a poem a poem: a necessary urge to plumb what it means to have “one wild and precious life,” as Mary Oliver wrote. Poems are meditations on the experience of being, explicitly created from breath and blood by deep listening, an acute attention to subject and language, and an embrace of discovery.
“Poetry keeps the door open to awe and ensures that we will find our way through the broken heart field of wars, losses and betrayals to understanding, compassion and gathering together,” Joy Harjo reminds us.
In the first of the scientists’ two experiments, they directed ChatGPT to produce texts that imitated poems by 10 poets: Geoffrey Chaucer, William Shakespeare, Samuel Butler, Lord Byron, Walt Whitman, Emily Dickinson, T.S. Eliot, Allen Ginsberg, Sylvia Plath, and Dorothea Lasky. Participants were assigned a poet and asked to determine the origin of the work before them: poet or bot.
The scientists reveal that the study participants “reported a low level of experience with poetry” and “found the task very difficult, and were at least in part answering randomly,” which they explain they tried to account for with an algorithm of their own.
The poets included in the study share similarities: all of them are white, seven were born before or in the 19th century, and, in addition to reflecting the vernacular of their day, each has a notably distinct style that could aid in machine mimicry and false positives.
Porter and Machery write that they culled the poets and their poems from an online poetry database and “aimed to cover a wide range of genres, styles, and time periods.”
Not only is the site they used not a high-ranking poetry resource (one can find it named in their study), the content on the site is disclaimed by its anonymous owners to “not be guaranteed to be error-free.” In and of itself, that doesn’t sound like a source that would engender trustworthy scientific results. The site is questionable in other ways. For example, here’s the opening line of the biography of an acclaimed Black woman writer on the site: “Audre Lorde was a 20th century American writer of often angry poetry and prose…”
Oddly, the poet Dorothea Lasky, the only living poet whose poems are included in the experiment, does not even appear on the site Porter and Machery name as their only source for poems. (In an interview with The Washington Post about the study, an upbeat Lasky welcomed the “robot poets.”)
Their second experiment asked a different, smaller group of participants to review poems authored by poets and an equal number of ChatGPT texts imitating poems and then assess them based on the following characteristics: “overall quality, imagery, rhythm, sound, beautiful, inspiring, lyrical, meaningful, moving, original, profound, witty, convey a particular theme, convey a particular mood or emotion, and rhyme.”
Participants rated the bot’s texts higher than poems authored by poets based on these characteristics. However, it should be noted that only rhyme is a specific and distinct formal element, making it easily recognizable. Rhyme, therefore, as in the previous experiment, could have influenced results favorably in the direction of ChatGPT.
Porter and Machery propose that their participants’ apparent preference for algorithmically generated texts arose because the texts were more straightforward and “generally more accessible” than the poems. The texts, the scientists suggest, therefore “are better at unambiguously communicating an image, a mood, an emotion, or a theme to non-expert readers of poetry, who may not have the time or interest for the in-depth analysis demanded by the poetry of human poets.”
This sounds less like an investigation into poetry and more like one into communication preferences. It also reveals a questionable assumption about poetry.
It’s true that reading poems takes time. In fact, one of the art form’s benefits is that it slows readers down and invites them to another place for a few moments. But it’s not true that poems sweepingly “demand” “in-depth analysis.” Poems live outside of academic settings, too. They roam free, ready to be engaged with and even enjoyed. Yes, poems ask for a reader’s attention and collaboration, as all creative writing does. But that poems are all inherently difficult is a stereotype that scares readers away from a genre that might offer them personal insight, solace, and inspiration.
The scientists’ second experiment also confirmed that participants judged texts more negatively when it was revealed they were actually produced by ChatGPT. Porter and Machery draw the conclusion that this means there is “a mismatch between readers’ expectations and reality.” This could instead be something more fundamental: a confirmation of a desire to be led by a human guide, to be able to imagine the hand, to take comfort in another being, to know soul.