It turns out all the jokes about the sci-fi movie AI companion “Her,” which even OpenAI CEO Sam Altman encouraged, were no laughing matter.
After OpenAI released its multimodal language model GPT-4o, touting its ability to interact via voice, the company announced Monday that it is pausing the use of the voice named “Sky.” The decision came amid a wave of suggestive comments, memes, and comparisons to the “Her” character, voiced by Scarlett Johansson.
Within hours, the actress released a statement announcing that she had retained legal counsel after she had twice turned down requests to use her voice prior to last week's unveiling.
"I was shocked, angered, and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine," Johansson said. Her attorneys sent letters to Altman and OpenAI, asking them to explain how they created the “Sky” voice.
Notably, Altman cryptically tweeted the single word “her” just ahead of the unveiling, possibly within hours of his second attempt to secure Johansson's participation—an action highlighted in her statement.
The response from OpenAI, including a blog post detailing how the voices were developed, seems to directly answer one of Johansson's questions.
"We've heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them," OpenAI stated yesterday.
Joanne Jang, model behavior lead at OpenAI, acknowledged the confusion and concerns surrounding the “Sky” voice, telling The Verge, “We want to take the feedback seriously and hear out the concerns.”
OpenAI's new voices were unveiled last week as a demonstration of the company's efforts to provide more human-like and natural conversations with its AI chatbot. The voices were designed to understand and respond to emotional cues, showcasing the advanced capabilities of GPT-4o with human-like touches such as pauses, sighs, and laughs.
Once it was available to the public, users began to test the model's boundaries and flirt with the voice. A flurry of tweets described Sky as “flirty,” “sexy,” and “provocative,” with some users joking that they now had a new girlfriend or were being seduced by the AI voice.
The 2013 Spike Jonze film "Her," in which Scarlett Johansson voiced an AI assistant that enchants a lonely writer, was a constant point of comparison. The situation escalated with comedy sketches referencing Sky's sultry vocals and apparent resemblance to the actress, a similarity acknowledged by OpenAI co-founder Andrej Karpathy.
On Friday, CEO Sam Altman tweeted about potential changes to Sky, assuring people that the new model was not publicly available.
On Sunday, the company took its first official steps to address the brewing controversy, explaining that it worked with voice actors to select and train its voice models. In the blog post, the company denied using Johansson's voice as a template for “Sky.”
"We worked with industry-leading casting and directing professionals to narrow down over 400 submissions before selecting the five voices,” OpenAI explained. “Each of the voices—Breeze, Cove, Ember, Juniper and Sky—are sampled from voice actors we partnered with to create them.”
“We believe that AI voices should not deliberately mimic a celebrity's distinctive voice—Sky's voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice,” the post notes.
OpenAI said it would not reveal the actors’ identities for privacy reasons. The company also said that it plans to introduce additional voices in the future to better match the diverse interests and preferences of users.
Probably none of those will flirt with you, however, based on the latest fallout.
Edited by Andrew Hayward. This article has been updated to include an official statement from Scarlett Johansson.