Google is using its advanced large language models (LLMs) for a purpose far beyond enterprise AI: decoding dolphin communication. The new system, called DolphinGemma, is designed to analyze and interpret the complex vocalizations of dolphins, bringing science a step closer to interspecies communication.
Google’s LLMs, such as Gemini, have already proven valuable in powering intelligent agents, custom chatbots, and automation tools across industries. Now the company is turning its AI capabilities to the ocean, developing a model specifically tailored to dolphin sounds: clicks, whistles, and burst pulses.
The project, known as DolphinGemma, is the result of a collaboration between Google, the Georgia Institute of Technology (Georgia Tech), and the Wild Dolphin Project (WDP). WDP has been conducting long-term research on Atlantic spotted dolphins in the Bahamas since 1985, compiling the world’s longest-running dataset of its kind.
Inside DolphinGemma: AI Meets Marine Biology
DolphinGemma is a 400-million-parameter audio-in, audio-out model that processes dolphin sounds much as a language model processes human speech. It identifies sound sequences, patterns, and structures that could represent elements of dolphin communication, akin to words and phrases in human languages.
The model uses Google’s SoundStream tokenizer to compress and represent dolphin audio data efficiently. Notably, the system runs directly on consumer-grade Google Pixel smartphones, making it more accessible and practical for field researchers working in remote ocean environments.
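The core idea behind this pipeline can be sketched in a few lines: a tokenizer turns continuous audio features into a sequence of discrete token IDs, which a language model can then learn from. The toy quantizer below is purely illustrative (the frames, codebook, and values are made up); the real SoundStream codec uses a learned neural encoder with residual vector quantization, not a nearest-neighbor lookup.

```python
import math

def tokenize_frames(frames, codebook):
    """Toy stand-in for an audio tokenizer: replace each continuous
    feature frame with the index of its nearest codebook vector,
    yielding a discrete token sequence a sequence model can consume."""
    return [
        min(range(len(codebook)), key=lambda i: math.dist(f, codebook[i]))
        for f in frames
    ]

# hypothetical 2-D "feature frames" and a 3-entry codebook
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
frames = [(0.1, 0.05), (0.9, 0.1), (0.05, 1.1), (0.1, 0.0)]
print(tokenize_frames(frames, codebook))  # -> [0, 1, 2, 0]
```

Once audio is reduced to token sequences like this, the same machinery used for text (next-token prediction, pattern mining) applies directly, which is what makes an "audio-in, audio-out" model tractable.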
In fact, WDP researchers currently use Pixel 6 smartphones for real-time data analysis during dives, and Pixel 9 devices are slated for deployment in upcoming summer field studies. The integration of commercial smartphones reduces reliance on custom hardware, cutting costs and improving the portability and maintainability of research equipment.
The CHAT System: Building a Human-Dolphin Vocabulary
Alongside DolphinGemma, researchers are utilizing the Cetacean Hearing Augmentation Telemetry (CHAT) system, developed by Georgia Tech and WDP. This underwater computing system attempts to build a shared vocabulary with dolphins using synthetic whistles linked to objects familiar to the animals.
CHAT operates in real time, isolating dolphin sounds from ambient ocean noise and providing feedback to divers via bone-conducting headphones. The system leverages the processing power of Pixel devices to analyze sounds instantly, supporting researchers as they test human-dolphin interaction possibilities.
This real-time, low-latency interaction could be a significant step toward developing a two-way communication channel between humans and dolphins.
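The matching step at the heart of a shared-vocabulary system like CHAT can be illustrated with a minimal sketch: compare a detected whistle against a small set of synthetic-whistle templates, each tied to a familiar object. Everything here is hypothetical (the function, the frequency contours, the tolerance, and the object labels are invented for illustration); the real CHAT system performs far more sophisticated signal processing on the Pixel hardware.

```python
def match_whistle(detected_freqs, vocabulary, tol=50.0):
    """Return the object label whose synthetic-whistle template matches
    the detected frequency contour within a per-point tolerance (Hz),
    or None if nothing in the vocabulary matches."""
    for label, template in vocabulary.items():
        if len(template) == len(detected_freqs) and all(
            abs(d - t) <= tol for d, t in zip(detected_freqs, template)
        ):
            return label
    return None  # unknown sound: fall through to ordinary logging

# hypothetical frequency contours (Hz) tied to familiar objects
vocabulary = {
    "scarf":    [8000.0, 9000.0, 10000.0],
    "seagrass": [12000.0, 11000.0, 10000.0],
}
print(match_whistle([8020.0, 8990.0, 10010.0], vocabulary))  # -> scarf
print(match_whistle([5000.0, 5000.0, 5000.0], vocabulary))   # -> None
```

The design point is latency: a lookup this cheap can run on every detected sound in real time, so a diver hears the matched label through bone-conducting headphones while the interaction is still happening.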
Google has announced plans to release DolphinGemma as an open-source model by summer 2025. This release will allow marine researchers across the globe to use and fine-tune the model for studying other cetacean species, such as bottlenose dolphins and spinner dolphins.
Though trained primarily on Atlantic spotted dolphin data, the model is adaptable. Google’s move underscores the importance of collaborative science and aims to democratize tools that can lead to broader discoveries in animal communication and cognition.
“By providing tools like DolphinGemma, we hope to give researchers worldwide the tools to mine their own acoustic datasets, accelerate the search for patterns and collectively deepen our understanding of these intelligent marine mammals,” Google said in a statement.
A Legacy of Dolphin Research
The Wild Dolphin Project has spent nearly four decades conducting non-invasive underwater research. Their extensive dataset, which includes synchronized video, audio, and behavioral annotations, has proven essential in training DolphinGemma.
Over the years, researchers have linked certain vocalizations to specific behaviors. For instance:
- Signature whistles help mothers and calves reunite.
- Squawks often occur during confrontations.
- Buzzes can indicate courtship or reactions to sharks.
DolphinGemma builds on these insights to analyze new data in the field, helping scientists identify recurring acoustic structures that could suggest the presence of syntax or even meaning in dolphin vocalizations.
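One simple way to surface the "recurring acoustic structures" described above is to count n-grams over a tokenized sound sequence and keep those that repeat. This is a toy stand-in for the pattern search DolphinGemma enables, not its actual method, and the token IDs below are made up.

```python
from collections import Counter

def recurring_ngrams(tokens, n=2, min_count=2):
    """Count length-n token sequences and return those occurring at
    least min_count times -- candidates for recurring structure."""
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {gram: c for gram, c in counts.items() if c >= min_count}

# hypothetical token stream from one recording session
tokens = [3, 7, 1, 3, 7, 2, 3, 7, 1, 5]
print(recurring_ngrams(tokens))  # -> {(3, 7): 3, (7, 1): 2}
```

A frequent bigram like `(3, 7)` would flag a sound pair worth cross-referencing against WDP's behavioral annotations to see whether it co-occurs with a specific context.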
As AI continues to evolve, Google’s DolphinGemma represents a significant leap toward understanding animal intelligence. By processing dolphin sounds with the precision of a language model, researchers are closer than ever to uncovering the hidden structure behind dolphin communication.
“We’re not just listening anymore,” says Google. “We’re beginning to understand the patterns within the sounds, paving the way for a future where the gap between human and dolphin communication might just get a little smaller.”