Rodents employ a broad spectrum of ultrasonic vocalizations (USVs) for social communication. As these vocalizations offer valuable insights into affective states, social interactions, and developmental stages of animals, various deep learning approaches have aimed to automate both the quantitative (detection) and qualitative (classification) analysis of USVs. Here, we present the first systematic evaluation of different types of neural networks for USV classification. We assessed various feedforward networks, including a custom-built fully connected network and a convolutional neural network (CNN), different residual neural networks (ResNets), an EfficientNet, and a Vision Transformer (ViT). Paired with a refined, entropy-based detection algorithm (94.9% recall and 99.3% precision), the best architecture (86.79% accuracy) was integrated into a fully automated pipeline capable of analyzing extensive USV datasets with high reliability. Additionally, users can specify a minimum accuracy threshold suited to their research needs. In this semi-automated setup, the pipeline selectively classifies only calls with high pseudo-probability, leaving the rest for manual inspection. Our study focuses exclusively on neonatal USVs. As part of an ongoing phenotyping study, our pipeline has proven to be a valuable tool for identifying key differences in USVs produced by mice with autism-like behaviors.
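The semi-automated gating described above can be illustrated with a minimal sketch: network logits are converted to softmax pseudo-probabilities, and each detected call is either accepted with its predicted label or flagged for manual review, depending on whether its top pseudo-probability meets a user-chosen threshold. The function and parameter names (`triage_calls`, `threshold`) are illustrative assumptions, not the pipeline's actual API.

```python
import numpy as np

def softmax(logits):
    """Convert raw network outputs to pseudo-probabilities (numerically stable)."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def triage_calls(logits, threshold=0.9):
    """Split detected calls into auto-classified and manual-review sets.

    Calls whose top pseudo-probability meets the user-chosen threshold
    keep the predicted class label; the rest are flagged for manual
    inspection. `threshold` is a hypothetical user setting.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    top_label = probs.argmax(axis=-1)          # predicted USV class per call
    accepted = probs.max(axis=-1) >= threshold  # True = auto-classify
    return top_label, accepted

# A confident call is auto-classified; an ambiguous one goes to review.
labels, accepted = triage_calls([[4.0, 0.1, 0.2], [1.0, 0.9, 1.1]])
```

Raising the threshold trades coverage for reliability: fewer calls are classified automatically, but those that are meet the user's minimum accuracy requirement.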