Rodents employ a broad spectrum of ultrasonic vocalizations (USVs) for social communication. Because these vocalizations offer valuable insights into the affective states, social interactions, and developmental stages of animals, various deep learning approaches have aimed to automate both the quantitative (detection) and qualitative (classification) analysis of USVs. Here, we present the first systematic evaluation of different types of neural networks for USV classification. We assessed various feedforward networks, including a custom-built, fully connected network and convolutional neural network, different residual neural networks (ResNets), an EfficientNet, and a Vision Transformer (ViT). Paired with a refined, entropy-based detection algorithm (achieving a recall of 94.9% and a precision of 99.3%), the best architecture (achieving 86.79% accuracy) was integrated into a fully automated pipeline capable of analyzing extensive USV datasets with high reliability. Additionally, users can specify an individual minimum accuracy threshold based on their research needs. In this semi-automated setup, the pipeline classifies only those calls to which it assigns a high pseudo-probability and leaves the rest for manual inspection. Our study focuses exclusively on neonatal USVs. As part of an ongoing phenotyping study, our pipeline has proven to be a valuable tool for identifying key differences in USVs produced by mice with autism-like behaviors.
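The semi-automated mode described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the pseudo-probability of a call is the largest softmax output of the classifier and that the confidence threshold is calibrated on a labeled held-out set; neither detail is stated in the abstract. The function names `softmax`, `pick_threshold`, and `triage` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw network outputs into pseudo-probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def pick_threshold(val_probs, val_labels, min_accuracy):
    """Return the lowest confidence threshold for which the accepted
    validation calls reach min_accuracy, or None if no threshold does.

    Assumption: confidence is the maximum pseudo-probability of a call.
    """
    scored = sorted(
        ((max(p), p.index(max(p)) == y) for p, y in zip(val_probs, val_labels)),
        key=lambda item: item[0],
        reverse=True,
    )
    best = None
    correct = 0
    for i, (conf, is_correct) in enumerate(scored, start=1):
        correct += is_correct
        # Calls with identical confidence are accepted together, so only
        # evaluate the accuracy at the end of each group of ties.
        if i < len(scored) and scored[i][0] == conf:
            continue
        if correct / i >= min_accuracy:
            best = conf
    return best


def triage(probs, threshold):
    """Split calls into automatically classified ones and ones left for
    manual inspection. Returns (auto, manual), where auto holds
    (call_index, predicted_class) pairs and manual holds call indices."""
    auto, manual = [], []
    for idx, p in enumerate(probs):
        conf = max(p)
        if threshold is not None and conf >= threshold:
            auto.append((idx, p.index(conf)))
        else:
            manual.append(idx)
    return auto, manual
```

In use, `pick_threshold` would be run once on validation data with the minimum accuracy chosen by the user, and `triage` would then be applied to the pseudo-probabilities of newly detected calls. A stricter accuracy requirement raises the threshold and sends more calls to manual inspection.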