OpenAI Shows Off Voice Chatbot, Touts New GPT-4o AI Model

The live demonstration of new conversational capabilities fueled comparisons to the virtual companion depicted in the movie “Her.”

By Jason Nelson

May 13, 2024

ChatGPT developer OpenAI today announced its latest AI model, GPT-4o (the “o” stands for “omni”), during a spring product update live stream, along with a slew of other changes, including a voice chatbot.

OpenAI updated its mobile apps immediately following its announcements and also launched a desktop app for ChatGPT. The company emphasized improvements to its user experience, which it says allow people to better focus on the conversations they have with ChatGPT.

“For the past couple of years, we’ve been very focused on improving the intelligence of these models, and they’ve gotten pretty good,” OpenAI chief technology officer Mira Murati said. “But this is the first time that we are really making a huge step forward when it comes to ease of use.”

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN

Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx

— OpenAI (@OpenAI) May 13, 2024

The livestream emphasized a simplified and more holistic approach to generative AI. An “omnimodel”—or natively multimodal—system does everything within its core application instead of coordinating among GPT for text, GPT Vision for images, and so on.

“We think it's very, very important that people have an intuitive feel for what the technology can do, so we really want to pair it with this broader understanding,” Murati said.

She noted that GPT-4o will be available to both paid and free ChatGPT users, as well as to developers using OpenAI’s API. Paid ChatGPT subscribers, Murati added, will continue to have access to up to five times the system capacity of free users. Everyone, she said, should be able to access OpenAI tools.

“We're always finding ways to reduce that friction, and recently, we made ChatGPT available without the signup flow,” she noted. In April, OpenAI allowed users to access ChatGPT 3.5 without signing up for an account.

OpenAI then showcased ChatGPT’s ability to hold a real-time casual conversation with users, demonstrating a variety of tones and emotions. The demo included ChatGPT singing, laughing, and joking with the OpenAI engineers. The company also claimed that ChatGPT can now determine a user’s emotional state using the mobile phone’s front-facing camera.

A new blog post outlined the major developments announced today, leading with “much more natural human-computer interaction.”

“It accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs,” the company wrote. “It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.”

Even before today's announcements, AI and tech enthusiasts suggested a voice chatbot powered by a next-generation AI model would make the personal companions depicted in the sci-fi movie “Her” a reality—including OpenAI CEO Sam Altman, in a cryptic, one-word Twitter post.

her

— Sam Altman (@sama) May 13, 2024

Using the ChatGPT desktop application, the OpenAI engineers showed that software code could be copied into ChatGPT, allowing the engineer to chat with ChatGPT about it. In the demo, OpenAI also showcased ChatGPT’s ability to perform real-time language translations across 20 languages. ChatGPT was additionally shown explaining a math problem after a photo of the equation was submitted to the app.

OpenAI and the broader generative AI industry have publicly committed to fighting the use of their technology to create AI-generated deepfakes. OpenAI acknowledged today that GPT-4o presents new safety challenges given its real-time audio and vision capabilities.

“Our team has been hard at work figuring out how to build mitigations against misuse,” Murati said. “We continue to work with different stakeholders out there—from government, media, entertainment, red teamers, and civil society—to figure out how to best bring these technologies into the world.”

Rumors about OpenAI’s big announcement had been circulating since the beginning of the month, ranging from the release of GPT-5 to ChatGPT powering Apple’s new version of Siri to an AI-powered search product launched ahead of Google’s anticipated announcement on May 14. On Friday, Bloomberg reported that OpenAI and Apple closed a deal that would bring OpenAI’s technology to the iPhone.

NEW: Apple and OpenAI expected to announce iPhone partnership today with a new ai-powered voice assistant.

You're all getting girlfriends... 😅 pic.twitter.com/6dx9SxdcWE

— Radar🚨 (@RadarHits) May 13, 2024

OpenAI CEO Sam Altman took to Twitter to calm the waters on Friday, tweeting, “Not GPT-5, not a search engine, but we’ve been hard at work on some new stuff we think people will love! Feels like magic to me.”

not gpt-5, not a search engine, but we’ve been hard at work on some new stuff we think people will love! feels like magic to me.

monday 10am PT. https://t.co/nqftf6lRL1

— Sam Altman (@sama) May 10, 2024

OpenAI was launched in 2015 by Sam Altman, Elon Musk, Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, Jessica Livingston, John Schulman, Pamela Vagata, and Wojciech Zaremba. The company and its wildly popular ChatGPT, released in November 2022, have dominated the conversation surrounding generative AI.

OpenAI has close ties to Microsoft, which has invested in the company, and its ChatGPT and Dall-E 3 have been integrated into Microsoft's suite of Office 365 tools and the new Copilot AI assistant.

In March, Musk sued OpenAI and Altman, claiming that the AI developer had prioritized Microsoft’s commercial interests over the public good.
