Using Machine Learning to Decode the Complex Vocalizations of Sperm Whales

Sperm whales have the largest brains of any animal on Earth, and for decades scientists suspected that their complex clicks carried meaning. By applying machine learning to underwater recordings, researchers have now identified a set of combinable building blocks, which they describe as a phonetic alphabet, within these calls.

The Challenge of Understanding Whale Speak

Sperm whales communicate in the dark depths of the ocean using rapid bursts of clicks known as codas. Sperm whale clicks are extremely powerful: the loudest reach up to 230 decibels as measured underwater, making them some of the loudest sounds produced by any animal. For nearly twenty years, biologists have tried to understand exactly what these animals are saying to one another.

Biologist Shane Gero founded the Dominica Sperm Whale Project in 2005 to study the Eastern Caribbean sperm whale clan. Over the years, his team dropped hydrophones (underwater microphones) into the water and recorded thousands of hours of acoustic data. They observed that families of whales had specific dialects and recognized each other through sound.

However, human ears and traditional audio software could only process a tiny fraction of this information. The audio files were massive, filled with overlapping clicks, ocean background noise, and sounds from other marine life. The patterns were simply too fast and too complex for scientists to decode manually.

Enter Project CETI and MIT CSAIL

The major turning point arrived when marine biologists teamed up with computer scientists. Project CETI (the Cetacean Translation Initiative) is a nonprofit research group dedicated to understanding whale communication. They partnered with researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) to bring artificial intelligence into the ocean.

In May 2024, the team published a study in the journal Nature Communications. Lead researcher Pratyusha Sharma and her colleagues took the archive of recordings from the Dominica Sperm Whale Project and analyzed it with machine learning methods.

Instead of relying on human researchers to manually tag and categorize every sound, the AI analyzed over 8,700 individual sperm whale codas. By training deep learning models to detect subtle acoustic variations, the software mapped out structural features in the clicks that scientists had previously missed.

Uncovering the Sperm Whale Phonetic Alphabet

The analysis revealed far more structure than scientists had expected. It showed that sperm whale codas are not random noise or simple identification tags. Instead, they form a highly structured communication system with distinct building blocks.

The researchers identified four key components that make up the newly discovered sperm whale phonetic alphabet:

  • Rhythm: This is the pattern of relative spacing between the clicks within a coda.
  • Tempo: This refers to the overall speed of a coda, which is reflected in its total duration.
  • Rubato: The researchers borrowed this musical term to describe how whales gradually stretch or compress the timing of their codas over the course of an exchange.
  • Ornamentation: This is the addition of a single extra click at the very end of a coda. It acts much like a suffix in human language or an exclamation point at the end of a sentence.
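
To make these building blocks concrete, here is a minimal Python sketch, not the study's actual pipeline, that derives simple stand-ins for three of them from click timestamps. It assumes that tempo can be approximated by a coda's total duration, rhythm by the inter-click intervals divided by that duration, and rubato by the change in duration between consecutive codas. The function names and sample timings are hypothetical.

```python
def coda_features(click_times):
    """Summarize one coda given ascending click times in seconds."""
    if len(click_times) < 2:
        raise ValueError("a coda needs at least two clicks")
    # Inter-click intervals (ICIs): gaps between successive clicks.
    icis = [later - earlier for earlier, later in zip(click_times, click_times[1:])]
    # Tempo stand-in (assumption): total duration from first to last click.
    duration = click_times[-1] - click_times[0]
    # Rhythm stand-in (assumption): ICIs as fractions of the duration,
    # so the pattern does not depend on overall speed.
    rhythm = [ici / duration for ici in icis]
    return {"tempo_s": duration, "rhythm": rhythm}


def duration_drift(codas):
    """Rubato stand-in (assumption): duration change between consecutive codas."""
    durations = [coda_features(coda)["tempo_s"] for coda in codas]
    return [b - a for a, b in zip(durations, durations[1:])]


# Hypothetical timings: the same evenly spaced rhythm at two tempos.
fast = [0.0, 0.2, 0.4, 0.6, 0.8]
slow = [0.0, 0.25, 0.5, 0.75, 1.0]
fast_features = coda_features(fast)
slow_features = coda_features(slow)
drift = duration_drift([fast, slow])
```

In this toy example both codas share the same rhythm while their tempo differs, and the drift between them is positive because the second coda is longer. Ornamentation is left out because detecting an extra final click requires a reference for what the unornamented coda looks like.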

By mixing and matching these four elements, sperm whales can create a vast array of different vocalizations. The analysis showed that whales vary their rubato and use of ornamentation depending on the conversational context with other members of their pod.
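
The arithmetic behind this mixing and matching is a simple product: if each element can take several values independently, the number of possible combinations is the product of the four inventory sizes. The short sketch below illustrates this with placeholder inventories. The labels and counts are invented for illustration and are not figures from the study.

```python
from itertools import product

# Placeholder inventories (assumptions, not values reported by the researchers).
RHYTHMS = ["rhythm_a", "rhythm_b", "rhythm_c"]
TEMPOS = ["slow", "fast"]
RUBATO = ["shortening", "steady", "lengthening"]
ORNAMENT = [False, True]

# Every combination of one value per element.
combinations = list(product(RHYTHMS, TEMPOS, RUBATO, ORNAMENT))

# The count equals the product of the inventory sizes: 3 * 2 * 3 * 2.
expected = len(RHYTHMS) * len(TEMPOS) * len(RUBATO) * len(ORNAMENT)
```

Even these small placeholder inventories yield 36 distinct combinations, which is why a handful of independent features can support a large repertoire.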

How AI Audio Processing Makes This Possible

Traditional marine biology relied heavily on scientists staring at visual representations of sound called spectrograms. This manual process was slow and prone to human error. Machine learning completely changed the rules of acoustic processing.

AI algorithms excel at pattern recognition. The software can filter out the ambient noise of waves and boat engines, isolating the specific frequencies of whale clicks. More importantly, the machine learning models examine the tiny gaps between clicks, known as inter-click intervals, measuring differences down to the millisecond.
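
As a rough illustration of the measurement step, the sketch below detects clicks in a synthetic noisy signal and reports the inter-click intervals in milliseconds. It is a toy under stated assumptions: a plain amplitude threshold with a short refractory period stands in for the real filtering and detection, and the sample rate, noise level, and click times are invented.

```python
import random


def detect_clicks(samples, sample_rate, threshold, refractory_s=0.01):
    """Return click onset times in seconds.

    A sample counts as a click onset when its magnitude reaches the
    threshold and at least refractory_s has passed since the last onset.
    """
    onsets = []
    last_onset = None
    for index, value in enumerate(samples):
        if abs(value) >= threshold:
            t = index / sample_rate
            if last_onset is None or t - last_onset >= refractory_s:
                onsets.append(t)
                last_onset = t
    return onsets


def inter_click_intervals_ms(onsets):
    """Gaps between successive click onsets, in milliseconds."""
    return [(b - a) * 1000.0 for a, b in zip(onsets, onsets[1:])]


# Synthetic one-second recording: low-level noise plus four short clicks.
rng = random.Random(0)
sample_rate = 10_000  # samples per second (assumed)
signal = [rng.gauss(0.0, 0.05) for _ in range(sample_rate)]
for click_time in (0.100, 0.250, 0.400, 0.600):
    start = round(click_time * sample_rate)
    for offset in range(3):  # each click lasts three samples
        signal[start + offset] += 1.0

onsets = detect_clicks(signal, sample_rate, threshold=0.5)
icis_ms = inter_click_intervals_ms(onsets)
```

The timing resolution of such a measurement is limited by the sample rate: at the assumed 10,000 samples per second, each sample spans 0.1 milliseconds.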

Human analysts might hear two codas and classify them as identical. The AI, however, can detect that one coda was slightly stretched or compressed in time, a rubato that changes its acoustic structure. Analyzing 8,700 codas manually would take years, but the AI processed millions of data points in a fraction of the time, revealing the underlying rules of whale syntax.
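
The point about resolution can be shown with a small comparison. In the hedged sketch below, two hypothetical codas count as identical when their inter-click intervals are compared with a coarse tolerance, but not when the tolerance is tightened to one millisecond. The tolerance values and click times are assumptions chosen for illustration.

```python
def icis_ms(click_times_s):
    """Inter-click intervals in milliseconds from click times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(click_times_s, click_times_s[1:])]


def same_within(coda_a, coda_b, tolerance_ms):
    """True if both codas have the same click count and every pair of
    corresponding inter-click intervals differs by at most tolerance_ms."""
    a, b = icis_ms(coda_a), icis_ms(coda_b)
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= tolerance_ms for x, y in zip(a, b))


# Hypothetical codas: the second slows down by a few milliseconds per click.
steady = [0.000, 0.200, 0.400, 0.600, 0.800]
slowing = [0.000, 0.204, 0.412, 0.624, 0.840]

coarse_match = same_within(steady, slowing, tolerance_ms=20.0)  # coarse comparison
fine_match = same_within(steady, slowing, tolerance_ms=1.0)     # millisecond scale
```

With the coarse tolerance the two codas match, while the fine tolerance separates them, which mirrors the gap between casual listening and millisecond-level measurement.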

Next Steps for Marine Communication

We cannot translate these clicks into English just yet. Scientists still do not know if a specific rhythm means “giant squid nearby” or if a certain ornamentation signals a warning. However, establishing this phonetic alphabet is the required first step toward actual translation.

David Gruber, the founder and president of Project CETI, notes that the next phase involves linking these specific audio patterns to physical behaviors. To do this, researchers will deploy underwater drones, robotic fish, and non-invasive sensor tags to record the whales’ movements in 3D space.

By using AI to analyze both the audio alphabet and the physical behavior simultaneously, scientists hope to eventually map specific phrases to specific actions. For now, machine learning has shown that sperm whale communication is far more complex and structured than previously assumed.

Frequently Asked Questions

What is a sperm whale coda?

A coda is a short, rapid burst of clicks that sperm whales use to communicate with one another. Unlike their echolocation clicks used for hunting, codas are specifically used for socializing.

Can AI translate what whales are saying to each other?

Not yet. The machine learning algorithms have identified the underlying structure and phonetic alphabet of the sounds, but scientists do not yet know the specific meanings or definitions of these click patterns.

Where did the data for this AI study come from?

The AI analyzed over 8,700 codas collected by the Dominica Sperm Whale Project. This project has been recording the Eastern Caribbean clan of sperm whales off the coast of Dominica since 2005.

What is Project CETI?

Project CETI (Cetacean Translation Initiative) is a multidisciplinary nonprofit organization that uses advanced machine learning, robotics, and linguistics to decode the communication of sperm whales.