Auscultation of internal body sounds is essential for diagnosing a range of health conditions, yet its effectiveness is often limited by the clinician's expertise and the acoustic limits of human hearing, which restricts its use across clinical scenarios. To address these challenges, we introduce AuscultaBase, a foundational framework aimed at advancing body sound diagnostics through multi-source data integration and contrastive learning. Our contributions are threefold. First, we compile AuscultaBase-Corpus, a large-scale, multi-source body sound database comprising 11 datasets with 40,317 audio recordings, totaling 322.4 hours of heart, lung, and bowel sounds. Second, we develop AuscultaBase-Model, a foundational diagnostic model for body sounds, trained with contrastive learning on the compiled corpus. Third, we establish AuscultaBase-Bench, a comprehensive benchmark of 16 sub-tasks, and use it to assess a range of open-source acoustic pre-trained models. Evaluation results indicate that our model outperforms all other open-source models on 12 of the 16 tasks, demonstrating the efficacy of our approach for body sound analysis.