Decoding · Sub-page
Contextual clustering — geometry meets behavior.
A cluster is just a number until you join it to what the crow was doing. The join is where geometry turns into meaning.

The join
The pipeline (stages 5–7) produces two tables: an audio table with cluster IDs, and a behavior table with timestamped observations. Joined by time window — usually a few-hundred-millisecond tolerance around the call — they yield a third table: per cluster, the distribution of behaviors that co-occurred.
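The time-window join can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code. The function name `join_calls_to_behavior`, the tuple layouts, the 0.3-second tolerance, and the sample data are all assumptions made for the example. Each call is matched to the nearest behavior observation and kept only if that observation falls within the tolerance.

```python
from bisect import bisect_left
from collections import Counter, defaultdict


def join_calls_to_behavior(calls, observations, tolerance_s=0.3):
    """Join calls to behavior observations by nearest timestamp.

    calls:        list of (time_s, cluster_id)    -- hypothetical layout
    observations: list of (time_s, behavior)      -- hypothetical layout
    Returns {cluster_id: Counter({behavior: count})}.
    """
    observations = sorted(observations)
    obs_times = [t for t, _ in observations]
    table = defaultdict(Counter)
    for call_time, cluster_id in calls:
        i = bisect_left(obs_times, call_time)
        # Only the observation just before and just after can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(observations)]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda j: abs(obs_times[j] - call_time))
        # Drop calls with no observation inside the tolerance window.
        if abs(obs_times[nearest] - call_time) <= tolerance_s:
            table[cluster_id][observations[nearest][1]] += 1
    return table


# Made-up sample data, for illustration only.
calls = [(10.00, 3), (12.40, 3), (20.10, 7), (31.00, 7), (45.00, 3)]
observations = [
    (10.10, "territorial_defense"),
    (12.35, "territorial_defense"),
    (20.00, "foraging"),
    (31.25, "contact"),
    (50.00, "foraging"),
]
joined = join_calls_to_behavior(calls, observations)
for cluster_id, counts in sorted(joined.items()):
    print(cluster_id, dict(counts))
```

The call at 45.00 s has no observation within the tolerance, so it drops out of the joined table rather than being assigned a context it may not belong to.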
The shape of that distribution is the signal. A cluster whose calls occur 80% during territorial defense and 5% during foraging is doing something different from a cluster that splits evenly across contexts. The first is interpretable; the second is either an encoding artifact or a genuinely context-generic call type (greetings, contact).
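One way to put a number on "shape" is the normalized Shannon entropy of a cluster's behavior counts: values near 0 mean the cluster is concentrated in one context, and values near 1 mean it is spread evenly. This is a sketch under stated assumptions. Entropy is a choice made here, not a measure the pipeline is known to use, and the counts below are invented to mirror the 80%/5% example, with the remaining 15% assigned arbitrarily to a third context.

```python
import math


def normalized_entropy(counts, n_contexts):
    """Shannon entropy of a behavior-count dict, scaled to [0, 1].

    n_contexts is the number of behavior categories in the study
    (an assumption the caller must supply).
    """
    total = sum(counts.values())
    if total == 0 or n_contexts < 2:
        raise ValueError("need at least one call and two contexts")
    probs = [n / total for n in counts.values() if n > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(n_contexts)


# Invented counts, for illustration only.
specific = {"territorial_defense": 80, "foraging": 5, "contact": 15}
generic = {"territorial_defense": 33, "foraging": 33, "contact": 34}

score_specific = normalized_entropy(specific, n_contexts=3)
score_generic = normalized_entropy(generic, n_contexts=3)
print(round(score_specific, 2), round(score_generic, 2))
```

A high score alone does not say whether the cluster is an encoding artifact or a genuinely context-generic call; it only flags the cluster as one that needs that question asked.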
What the wearable-logger work showed
The 2026 carrion-crow paper (Demartsev et al., bioRxiv) is the cleanest recent example. The team deployed wearable audio loggers on a cooperatively breeding crow population, capturing audio and accelerometry per individual. The behavior log was the accelerometer trace, time-aligned to the second.
When they clustered the vocal embeddings and joined them to the accelerometry-derived behavioral states, they recovered both discrete repertoire structure (clusters that map cleanly to a single behavior) and graded structure (grunts that vary continuously with motor activity). The latter is the part that was invisible before: graded variation that the old hand-labeling regime had to either force into discrete categories or discard.
What this is not
It is not "the crow says X means Y." The joined distribution captures co-occurrence, not semantics. We don't know whether the call causes the behavior, describes the behavior, or simply accompanies the behavioral state.
Distinguishing those would require intervention — playback experiments with calibrated control — which the ethics floor constrains heavily and which the Respond stage only barely starts to address.
The honest interpretation
Contextual clustering gives you a probabilistic map from acoustic form to behavioral context. That map is a strong foundation for designing playback experiments, for spotting outlier calls worth investigating, and for detecting repertoire change over time. It is not, by itself, a dictionary.
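As a rough sketch of what the probabilistic map looks like in practice, the code below estimates P(context | cluster) from joined (cluster, behavior) pairs and flags calls whose context is rare for their cluster. The function names, the 0.1 threshold, and the data are hypothetical and chosen only for illustration. A flagged call is a candidate for closer inspection, nothing more.

```python
from collections import Counter, defaultdict


def context_given_cluster(pairs):
    """Estimate P(behavior | cluster) from (cluster_id, behavior) pairs."""
    counts = defaultdict(Counter)
    for cluster_id, behavior in pairs:
        counts[cluster_id][behavior] += 1
    return {
        cluster_id: {b: n / sum(ctr.values()) for b, n in ctr.items()}
        for cluster_id, ctr in counts.items()
    }


def unusual_calls(pairs, max_prob=0.1):
    """Return (index, cluster_id, behavior) for calls in a rare context."""
    pairs = list(pairs)
    prob_map = context_given_cluster(pairs)
    return [
        (i, cluster_id, behavior)
        for i, (cluster_id, behavior) in enumerate(pairs)
        if prob_map[cluster_id][behavior] <= max_prob
    ]


# Invented data: cluster 3 is mostly territorial, cluster 7 is mixed.
pairs = (
    [(3, "territorial_defense")] * 19
    + [(3, "foraging")]
    + [(7, "contact")] * 5
    + [(7, "foraging")] * 5
)
prob_map = context_given_cluster(pairs)
flagged = unusual_calls(pairs)
print(flagged)
```

With small clusters these estimates are noisy, so a real analysis would likely want a minimum call count per cluster before flagging anything.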