Why try to understand Gen Z slang when it might be easier to communicate with animals?
Today, Google unveiled DolphinGemma, an open-source AI model designed to decode dolphin communication by analyzing their clicks, whistles, and burst pulses. The announcement coincided with National Dolphin Day.
The model, created in partnership with Georgia Tech and the Wild Dolphin Project (WDP), learns the structure of dolphins’ vocalizations and can generate dolphin-like sound sequences.
The breakthrough could help determine whether dolphin communication rises to the level of language.
Trained on data from the world's longest-running underwater dolphin research project, DolphinGemma leverages decades of meticulously labeled audio and video that WDP has collected since 1985.
The project has studied Atlantic spotted dolphins in the Bahamas across generations using a non-invasive approach they call "In Their World, on Their Terms."
"By identifying recurring sound patterns, clusters and reliable sequences, the model can help researchers uncover hidden structures and potential meanings within the dolphins' natural communication—a task previously requiring immense human effort," Google said in its announcement.
The AI model, which contains roughly 400 million parameters, is small enough to run on Pixel phones that researchers use in the field. It processes dolphin sounds using Google's SoundStream tokenizer and predicts subsequent sounds in a sequence, much like how human language models predict the next word in a sentence.
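Stripped to its essentials, that objective is ordinary next-token prediction, just over audio tokens instead of words. The sketch below is illustrative only: a random tensor stands in for tokenized dolphin audio, a toy transformer stands in for the model, and the vocabulary and layer sizes are assumptions, since Google has not published DolphinGemma's training code.

```python
# A minimal sketch of next-token prediction over audio tokens.
# The real system tokenizes audio with Google's SoundStream codec;
# here we use random token IDs. All names and sizes are illustrative.
import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed codebook size for the audio tokenizer
CONTEXT = 256       # assumed context window, in audio tokens

class TinyAudioLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 128)
        layer = nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(128, VOCAB_SIZE)

    def forward(self, tokens):
        # Causal mask so each position only attends to earlier sounds.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.blocks(self.embed(tokens), mask=mask)
        return self.head(h)  # logits over the next audio token

model = TinyAudioLM()
tokens = torch.randint(0, VOCAB_SIZE, (1, CONTEXT))  # stand-in for tokenized whistles
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1)
)  # next-token prediction, exactly as in text LLMs
```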
DolphinGemma doesn't operate in isolation. It works alongside the CHAT (Cetacean Hearing Augmentation Telemetry) system, which associates synthetic whistles with specific objects dolphins enjoy, such as sargassum, seagrass, or scarves, potentially establishing a shared vocabulary for interaction.
"Eventually, these patterns, augmented with synthetic sounds created by the researchers to refer to objects with which the dolphins like to play, may establish a shared vocabulary with the dolphins for interactive communication," according to Google.
Field researchers currently use Pixel 6 phones for real-time analysis of dolphin sounds.
The team plans to upgrade to Pixel 9 devices for the summer 2025 research season. The new handsets will integrate speaker and microphone functions while running deep learning models and template-matching algorithms simultaneously.
The shift to smartphone technology dramatically reduces the need for custom hardware, a crucial advantage for marine fieldwork. DolphinGemma's predictive capabilities can help researchers anticipate and identify potential mimics earlier in vocalization sequences, making interactions more fluid.
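Template matching itself is a classic signal-processing technique: slide a stored reference sound across the incoming stream and flag offsets where the correlation spikes. Below is a rough, self-contained sketch of that idea; the whistle shape, noise level, and detection threshold are invented for illustration, not taken from the CHAT system.

```python
# Sliding normalized cross-correlation: a stored whistle template is
# compared against every offset of the incoming audio, and a high score
# flags a possible mimic. All signals and thresholds are illustrative.
import numpy as np

def normalized_xcorr(signal: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Correlation of `template` at every offset, scaled to roughly [-1, 1]."""
    t = (template - template.mean()) / (template.std() + 1e-9)
    n = len(t)
    scores = np.empty(len(signal) - n + 1)
    for i in range(len(scores)):
        w = signal[i : i + n]
        w = (w - w.mean()) / (w.std() + 1e-9)
        scores[i] = float(w @ t) / n
    return scores

rng = np.random.default_rng(0)
template = np.sin(np.linspace(0, 40, 400))   # stand-in whistle contour
stream = rng.normal(0, 0.3, 4000)            # stand-in hydrophone noise
stream[1200:1600] += template                # a "mimic" buried in the noise
scores = normalized_xcorr(stream, template)
hit = int(scores.argmax())
if scores[hit] > 0.5:                        # assumed detection threshold
    print(f"Possible mimic near sample {hit} (score {scores[hit]:.2f})")
```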

Understanding what cannot be understood
DolphinGemma joins several other AI initiatives aimed at cracking the code of animal communication.
The Earth Species Project (ESP), a nonprofit organization, recently developed NatureLM, an audio language model capable of identifying an animal's species and approximate age, and whether its sounds indicate distress or play. That falls short of language, but it is a foothold for establishing some primitive communication.
The model, trained on a mix of human language, environmental sounds, and animal vocalizations, has shown promising results even with species it hasn't encountered before.
Project CETI represents another significant effort in this space.
Led by researchers including Michael Bronstein from Imperial College London, it focuses specifically on sperm whale communication, analyzing their complex patterns of clicks used over long distances.
The team has identified 143 click combinations that might form a kind of phonetic alphabet, which they're now studying using deep neural networks and natural language processing techniques.
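One way to build intuition for that NLP angle: if each coda (click pattern) is treated as a discrete symbol, standard n-gram counting will surface recurring sequences, just as it does in text. The toy example below uses made-up coda labels and is not Project CETI's actual pipeline.

```python
# Treat each coda as a symbol and count recurring bigrams, the same
# statistic that exposes word-order structure in text. Coda labels
# here are invented for illustration.
from collections import Counter

codas = ["1+1+3", "5R1", "1+1+3", "5R1", "4R2", "1+1+3", "5R1", "4R2"]
bigrams = Counter(zip(codas, codas[1:]))
for pair, count in bigrams.most_common(3):
    print(pair, count)
```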
While these projects focus on decoding animal sounds, researchers at New York University have taken inspiration from baby development for AI learning.
Their Child's View for Contrastive Learning model (CVCL) learned language by viewing the world through a baby's perspective, using footage from a head-mounted camera worn by an infant from 6 months to 2 years old.
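CVCL's training signal is contrastive: embeddings of a video frame and a word heard at the same moment are pulled together, while mismatched frame-word pairs are pushed apart. The sketch below shows that symmetric contrastive loss in miniature, with placeholder linear encoders and random tensors standing in for NYU's actual architecture and data.

```python
# A minimal sketch of a CLIP-style contrastive objective: matching
# frame-word pairs get high similarity, mismatched pairs get low.
# Encoders and data here are placeholders, not NYU's model.
import torch
import torch.nn.functional as F

image_encoder = torch.nn.Linear(512, 64)   # stand-in for a vision encoder
text_encoder = torch.nn.Linear(300, 64)    # stand-in for a word embedding net

frames = torch.randn(8, 512)               # 8 camera frames
words = torch.randn(8, 300)                # the 8 words heard with them

img = F.normalize(image_encoder(frames), dim=-1)
txt = F.normalize(text_encoder(words), dim=-1)
logits = img @ txt.T / 0.07                # similarity of every frame-word pair
targets = torch.arange(8)                  # the i-th frame matches the i-th word
loss = (F.cross_entropy(logits, targets) +
        F.cross_entropy(logits.T, targets)) / 2
```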
The NYU team found that their AI could learn efficiently from naturalistic data, much as human infants do, in sharp contrast to traditional AI models that require trillions of words for training.
Google plans to share an updated version of DolphinGemma this summer, potentially extending its utility beyond Atlantic spotted dolphins. Still, the model may require fine-tuning for different species' vocalizations.
WDP has focused extensively on correlating dolphin sounds with specific behaviors, including signature whistles used by mothers and calves to reunite, burst-pulse "squawks" during conflicts, and click "buzzes" used during courtship or when chasing sharks.
"We're not just listening anymore," Google noted. "We're beginning to understand the patterns within the sounds, paving the way for a future where the gap between human and dolphin communication might just get a little smaller."
Edited by Sebastian Sinclair and Josh Quittner