A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations

Date
2022-06-03
Authors
Thomas, Mara
Jensen, Frants H.
Averly, Baptiste
Demartsev, Vlad
Manser, Marta B.
Sainburg, Tim
Roch, Marie
Strandburg-Peshkin, Ariana
DOI
10.1111/1365-2656.13754
Keywords
animal sounds
animal vocalizations
bioacoustics
call classification
dimensionality reduction
spectrogram
UMAP
unsupervised learning
Abstract
1. The manual detection, analysis and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups and quantify similarities between them. Among all computational methods that have been proposed to accomplish this, neighbourhood-based dimensionality reduction of spectrograms to produce a latent space representation of calls stands out for its conceptual simplicity and effectiveness.
2. Using a dataset of manually annotated meerkat (Suricata suricatta) vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyse strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabelled calls.
3. All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
Description
© The Author(s), 2022. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Thomas, M., Jensen, F. H., Averly, B., Demartsev, V., Manser, M. B., Sainburg, T., Roch, M. A., & Strandburg-Peshkin, A. A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations. The Journal of Animal Ecology, 91(8), (2022): 1567–1581, https://doi.org/10.1111/1365-2656.13754.
Citation
Thomas, M., Jensen, F. H., Averly, B., Demartsev, V., Manser, M. B., Sainburg, T., Roch, M. A., & Strandburg-Peshkin, A. (2022). A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations. The Journal of Animal Ecology, 91(8), 1567–1581.
Except where otherwise noted, this item's license is described as Attribution-NonCommercial 4.0 International