EconPapers    
Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires

Tim Sainburg, Marvin Thielk and Timothy Q Gentner

PLOS Computational Biology, 2020, vol. 16, issue 10, 1-48

Abstract: Animals produce vocalizations that range in complexity from a single repeated call to hundreds of unique vocal elements patterned in sequences that unfold over hours. Characterizing complex vocalizations can require considerable effort and deep intuition about each species' vocal behavior. Even with a great deal of experience, human characterizations of animal communication can be affected by human perceptual biases. We present a set of computational methods for projecting animal vocalizations into low-dimensional latent representational spaces that are learned directly from the spectrograms of vocal signals. We apply these methods to diverse datasets from over 20 species, including humans, bats, songbirds, mice, cetaceans, and nonhuman primates. Latent projections uncover complex features of data in visually intuitive and quantifiable ways, enabling high-powered comparative analyses of vocal acoustics. We introduce methods for analyzing vocalizations both as discrete sequences and as continuous latent variables. Each method can be used to disentangle complex spectro-temporal structure and observe long-timescale organization in communication.

Author summary: Of the thousands of species that communicate vocally, the repertoires of only a tiny minority have been characterized or studied in detail. This is due, in large part, to traditional analysis methods that require a high level of expertise that is hard to develop and often species-specific. Here, we present a set of unsupervised methods for projecting animal vocalizations into latent feature spaces in order to quantitatively compare and develop visual intuitions about animal vocalizations. We demonstrate these methods across a series of analyses of 19 datasets of animal vocalizations from 29 different species, including songbirds, mice, monkeys, humans, and whales. We show how learned latent feature spaces untangle complex spectro-temporal structure, enable cross-species comparisons, and uncover high-level attributes of vocalizations such as stereotypy in vocal element clusters, population regiolects, coarticulation, and individual identity.
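The core idea of the abstract — flattening spectrograms of vocal elements and embedding them in a low-dimensional latent space — can be sketched in a few lines. This is only an illustrative stand-in, not the authors' implementation: the paper's pipeline uses nonlinear embeddings (UMAP) on segmented syllable spectrograms, while the sketch below uses a simple PCA via SVD purely to keep it dependency-free; the function name and toy data are hypothetical.

```python
import numpy as np

def project_spectrograms(spectrograms, n_components=2):
    """Project equal-sized spectrograms into a low-dimensional latent space.

    Illustrative stand-in for the paper's approach: each spectrogram is
    flattened to a feature vector and reduced with PCA. (The paper itself
    uses a nonlinear method, UMAP; PCA is used here only to keep the
    sketch free of third-party embedding libraries.)
    """
    # stack flattened spectrograms into an (n_samples, n_features) matrix
    X = np.stack([s.ravel() for s in spectrograms]).astype(float)
    X -= X.mean(axis=0)  # center each feature
    # PCA via SVD: rows of Vt are the principal axes of the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    # coordinates of each vocalization in the latent space
    return X @ Vt[:n_components].T

# toy example: 10 random 32x16 "spectrograms" (stand-ins for real syllables)
rng = np.random.default_rng(0)
specs = [rng.random((32, 16)) for _ in range(10)]
latent = project_spectrograms(specs)
print(latent.shape)  # one 2-D latent point per vocalization
```

In the paper's analyses, points that are nearby in such a latent space correspond to acoustically similar vocal elements, which is what makes clustering, cross-species comparison, and visualization of repertoire structure possible.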

Date: 2020

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008228 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 08228&type=printable (application/pdf)



Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1008228

DOI: 10.1371/journal.pcbi.1008228


More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for this series maintained by ploscompbiol.

Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1008228