Rodents employ a broad spectrum of ultrasonic vocalizations (USVs) for social communication. As these vocalizations offer valuable insights into affective states, social interactions, and developmental stages of animals, various deep learning approaches have aimed to automate both the quantitative (detection) and qualitative (classification) analysis of USVs. Here, we present the first systematic evaluation of different types of neural networks for USV classification. We assessed various feedforward networks, including a custom-built fully connected network and a convolutional neural network, different residual neural networks (ResNets), an EfficientNet, and a Vision Transformer (ViT). Paired with a refined, entropy-based detection algorithm (94.9% recall, 99.3% precision), the best-performing architecture (86.79% accuracy) was integrated into a fully automated pipeline capable of analyzing extensive USV datasets with high reliability. Additionally, users can specify a minimum accuracy threshold tailored to their research needs. In this semi-automated setup, the pipeline selectively classifies calls with high pseudo-probability, leaving the rest for manual inspection. Our study focuses exclusively on neonatal USVs. As part of an ongoing phenotyping study, our pipeline has proven to be a valuable tool for identifying key differences in USVs produced by mice with autism-like behaviors.
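The two mechanisms the abstract names can be sketched compactly. The following is a minimal illustration, not the paper's implementation: it assumes the entropy-based detector scores short spectral frames (tonal USV energy concentrates in few frequency bins, lowering spectral entropy relative to broadband noise) and that the semi-automated mode routes each detected call by its top softmax pseudo-probability against a user-chosen threshold. All function names and the threshold value are hypothetical.

```python
import numpy as np

def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (bits) of the frame's normalized power spectrum.
    Tonal USV segments yield low entropy; broadband noise yields high entropy."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    power = power / (power.sum() + eps)
    return float(-np.sum(power * np.log2(power + eps)))

def route_calls(pseudo_probs, threshold=0.95):
    """Semi-automated routing: auto-accept calls whose top softmax score
    clears the user-specified threshold; flag the rest for manual review.
    `pseudo_probs` is an (n_calls, n_classes) array of softmax outputs."""
    auto, manual = [], []
    for i, p in enumerate(pseudo_probs):
        (auto if p.max() >= threshold else manual).append(i)
    return auto, manual

# Illustration: a pure tone scores lower entropy than white noise.
t = np.arange(1024)
tone = np.sin(2 * np.pi * 50 * t / 1024)
rng = np.random.default_rng(0)
noise = rng.standard_normal(1024)
print(spectral_entropy(tone) < spectral_entropy(noise))  # True

# Illustration: only the confident call is auto-classified.
probs = np.array([[0.98, 0.02], [0.60, 0.40]])
print(route_calls(probs, threshold=0.95))  # ([0], [1])
```

In practice the detector would slide such an entropy measure over spectrogram frames and threshold it to segment candidate calls, and the routing threshold trades classification coverage against the effective accuracy of the auto-classified subset.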