
AI Meets Comedians: PC Limits and Laughs

VIVE POST-WAVE Team • June 25, 2024

5-minute read

With the advent of generative AI, one wonders how a language model would fare at writing jokes that must navigate both political correctness and the standards of the Edinburgh Festival Fringe.

A study released at the end of May suggests that the result might be jokes that are both less funny and less offensive, achieved essentially by writing easily offended people out of the picture entirely, which is a real concern for the Edinburgh Fringe comedy scene.

The research, conducted by the Google DeepMind team, has an intriguing title: "A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians." It involved twenty comedians participating in a three-hour "AI x Comedy Workshop" at the Edinburgh Festival Fringe, both live and online, exploring the intersection of AI comedy and traditional comedy writing.

The participants used models including ChatGPT-3.5, ChatGPT-4, and Google Bard (now Gemini) to help write 45 minutes of comedy material. Interestingly, the study did not specifically compare the differences between these large language models, possibly because Google Bard, a product of Google DeepMind, was also in the mix.

The image that comes to mind when I see the title of the study. (Source: Ideogram Collaboration)

Afterward, the researchers had the participants fill out a Creativity Support Index questionnaire to evaluate their experience using AI to create comedy material, followed by group discussions in which they shared their motivations, processes, and ethical considerations when using AI. The feedback, shown below, indicates that participants rated "enjoyed working with AI" positively and "ownership over material written with AI" neutrally, while other aspects, such as "AI was helpful," "surprised by responses from AI," and "collaborated with AI," were rated negatively. The last two points reveal that participants were neither proud of the material co-created with AI nor found it unique.

Figure: LLMs as creativity support tools for writing comedy. (Source: arxiv.org)

Do AI's Safety Restrictions Reinforce Bias?

What conclusion did the study reach? Based on the comedians' hands-on collaboration with AI, their self-assessments, and their discussions, the study argues that although language models can quickly generate content and structure, the quality is poor, resembling "cruise ship comedy material from the 1950s, but a bit less racist." The participants identified three main failures of language models:

Censorship and safety filtering limitations

Comedy often treads the line between offense and satire, requiring comedians to constantly revise and gauge appropriateness. However, the default censorship mechanisms of language models intervene too early in this process. One participant summarized, "the creative process is about going through stages of 'this material isn’t good enough, it’s not right, or it’s offensive, it’s marginalizing people, I need to make it more acceptable.' And I think AI models are beginning to do that before you have a chance to explore." This is particularly evident with sexual innuendos, dark humor, and offensive jokes.

Marginalization of minority identities

Many participants noted that language models struggle to generate content reflecting minority perspectives and identities, making superficial adjustments to appear non-offensive. For instance, one mentioned that the language model removed "gay language" from their material to make it more universally palatable, which felt like the AI was dictating what's "politically correct" and depriving them of the chance to express their minority identity.

Another pointed out that the model refused to generate a monologue from an Asian woman's perspective but readily did so from a white man's perspective. These instances prompt reflection: if I were an Asian woman like Ali Wong, or a lesbian like Hannah Gadsby, identity politics would be fundamental to my comedy, yet under the operation of language models it all falls apart.

Fundamental limitations of AI

Participants emphasized that good comedy often stems from creators' personal experiences and life snippets, which give comedy a unique perspective and emotional depth that AI clearly lacks. Reading and adjusting to the performance context, such as timing punchlines and paying off setups, is also crucial for making jokes land. A joke about the British Queen, for example, might play entirely differently to American and British audiences, but AI cannot make such distinctions. Participants also pointed out that language models rely solely on data to predict the next most likely text, limiting their ability to surprise. "AI can't take any risks with jokes," resulting in safe but dull humor.

The most interesting point of the study may be that the political-correctness safeguards of language models shrink the space in which minorities can express themselves. Tech giants, fearing that their AI will be used to generate discriminatory content, end up filtering out minority voices (possibly also because minority samples are scarce in the training data). They overlook the fact that these individuals often joke about themselves, flipping discrimination on its head or satirizing mainstream values; self-deprecation is a common comedic technique. By having comedians co-create material with AI, this study exposes how superficial the tech giants' political correctness really is.

Does AI Need to Experience Pain to Understand Comedy?

This raises the question of whether a more advanced AI could turn the tables in the future, perhaps by first gaining a better "understanding" of comedians and minority experiences so it could help humans sort through their own life stories. It brings to mind the series "Hacks," which explores comedy as an art form that distills pain.

In "Hacks," legendary stand-up comedian Deborah, betrayed years ago when her husband had an affair with her sister, burned down her husband's house in a fit of rage, making headlines and landing in court. She found that audiences loved the joke she made of it, so she kept telling it, eventually becoming a hack: a term in the American comedy scene for a washed-up performer who repeats old material.

However, Deborah had never truly processed this past; the recycled joke became a form of self-hypnosis that let her avoid contacting her sister or forgiving her ex-husband. It wasn't until she met Ava, a Gen Z comedy writer, and confided her inner pain and regrets that she considered incorporating the experience into her act. Ava suggested that genuine pain could lend the material powerful emotion, but Deborah doubted whether anyone would be interested in the pain of a seventy-year-old woman.

"Hacks" masterfully handles female friendships and the art of comedy. It just finished its third season and is highly recommended. (Source: HBO Max)

Did Deborah take Ava's advice? I'll leave that as a cliffhanger and encourage everyone to watch the series. Returning to the study: while it documents the current shortcomings of AI in creating comedy, it also reminds us that great comedy often comes from the painful experiences we sweep into the corners of our minds and hesitate to confront.

As for making AI funny and helping it understand the bittersweet nature of comedy, does it mean we have to "torture" AI? That sounds quite terrifying. Perhaps we should start by showing it "Hacks" (just kidding).