CrowLingo


Combinatorial evidence — does call order matter?

The frontier of decoding. Statistical models say something non-random is happening at the sequence level. Behavioral confirmation is the part we don't have.

The statistical evidence

Treat a sequence of calls from one crow over one observation session as a string of cluster labels. Train a small sequence model (n-grams, transformer, RNN — choose your tool) to predict the next cluster from the previous few. Compare the model's predictive entropy to a shuffled baseline: the same call labels in random order, which preserves call frequencies but destroys any sequence-level structure.

The result, across multiple datasets: the trained model beats the shuffled baseline. Not by a huge margin, but by a statistically meaningful one. Call order is not random; there is sequence-level structure.
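The comparison above can be sketched in a few lines. This is a minimal illustration, not any published pipeline: the cluster labels, transition weights, and session generator are all hypothetical, and a smoothed bigram model stands in for whatever sequence model you choose. The point is only the shape of the test — fit on the observed order, score it, then score a shuffled copy of the same labels.

```python
import math
import random
from collections import Counter, defaultdict

def bigram_entropy(seq, alpha=1.0):
    """Average per-symbol cross-entropy (bits) of an add-alpha
    smoothed bigram model, fit and evaluated on the same sequence."""
    vocab = sorted(set(seq))
    counts = defaultdict(Counter)
    for prev, nxt in zip(seq, seq[1:]):
        counts[prev][nxt] += 1
    total_bits = 0.0
    n = 0
    for prev, nxt in zip(seq, seq[1:]):
        row = counts[prev]
        denom = sum(row.values()) + alpha * len(vocab)
        p = (row[nxt] + alpha) / denom
        total_bits -= math.log2(p)
        n += 1
    return total_bits / n

# Hypothetical session generator: alarm calls ("A") tend to repeat,
# food calls ("F") tend to precede assembly calls ("S").
rng = random.Random(0)
session, state = [], "A"
for _ in range(2000):
    session.append(state)
    if state == "A":
        state = rng.choices(["A", "F", "C"], weights=[6, 2, 2])[0]
    elif state == "F":
        state = rng.choices(["S", "C"], weights=[7, 3])[0]
    else:
        state = rng.choices(["A", "F", "C", "S"], weights=[1, 1, 1, 1])[0]

observed = bigram_entropy(session)       # entropy in natural order
shuffled = session[:]
rng.shuffle(shuffled)
baseline = bigram_entropy(shuffled)      # same labels, random order
print(f"observed {observed:.3f} bits vs shuffled {baseline:.3f} bits")
```

On structured input the observed entropy comes out below the shuffled baseline; on truly order-free input the two converge. In practice you would repeat the shuffle many times and report where the observed value falls in that null distribution, rather than comparing against a single permutation.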

What that means, and what it doesn't

The honest reading: certain transitions between call types are more likely than others, and conditional structure exists. This is the same kind of result we'd expect in any system where context matters at all — including systems with no semantic syntax. Repeated calls cluster; alarm calls tend to follow other alarm calls; food-discovery calls tend to precede assembly calls. None of that requires meaning to be carried by combination.
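To make "certain transitions are more likely than others" concrete: tabulating bigram transition probabilities from a session exposes exactly this kind of conditional structure, and nothing in the table requires the combinations to carry meaning. The labels and the toy session below are hypothetical.

```python
from collections import Counter, defaultdict

def transition_probs(seq):
    """Maximum-likelihood bigram transition probabilities."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(seq, seq[1:]):
        counts[prev][nxt] += 1
    return {prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
            for prev, row in counts.items()}

# Hypothetical cluster labels: "alarm" repeats, "food" precedes "assembly".
session = ["alarm", "alarm", "alarm", "food", "assembly",
           "contact", "alarm", "alarm", "food", "assembly",
           "contact", "contact", "alarm", "food", "assembly"]

probs = transition_probs(session)
print(probs["alarm"])  # alarm is followed by alarm or food, never assembly
print(probs["food"])   # food is always followed by assembly in this toy data
```

A table like this is the whole of "conditional structure": it falls out of repetition and context alone, which is why it cannot, by itself, establish syntax.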

What would distinguish genuine syntax from mere conditional structure is compositionality: the meaning of a sequence is built from the meanings of its parts and their arrangement, so that different orderings of the same calls convey different messages. That requires behavioral validation. It requires playback experiments where compositional variants of the same call set produce different responses from the receiving crows.

What playback has and hasn't shown

The few studies that have attempted compositional playback have produced mixed results with small effect sizes. Crows respond differently to some sequence permutations than to others, but the effects are modest and the sample sizes tiny, because designing ethically defensible playback at scale is genuinely hard.

The most we can say in 2026: the door isn't closed. There is suggestive statistical structure and suggestive behavioral response. There is also no clean published demonstration that crow communication is meaningfully compositional. People who tell you otherwise are over-reading the data.

Where this is headed

Two threads run in parallel. First, more wearable-logger datasets, which give per-individual long-form sequences in natural context — the substrate sequence models actually need. Second, better playback protocols that can run calibrated A/B comparisons without violating the ethics floor (smaller stimulus sets, longer intervals between sessions, mandatory video, halts on any sign of distress).

If crow language is compositional, the evidence will land in the next decade. If it isn't, that's a useful finding too: it would mean that crow communication is extraordinarily rich at the single-call level but does something other than syntax at the sequence level. Either is interesting. Neither is yet decided.