Compared with invasive examinations that require tissue sampling, respiratory sound testing is non-invasive, safer, and easier for patients to accept. In this study, we introduce Rene, a large-scale model tailored for respiratory sound recognition. Rene is fine-tuned on an extensive dataset covering a broad range of respiratory audio samples, and it targets disease detection, sound pattern classification, and event identification. Our approach applies a pre-trained speech recognition model to respiratory sounds and augments it with patient medical records. The resulting multi-modal deep-learning framework addresses the interpretability and real-time diagnostic challenges that have hindered previous respiratory-focused models. In benchmark comparisons, Rene significantly outperforms existing models, achieving improvements of 10.27%, 16.15%, 15.29%, and 18.90% in respiratory event detection and audio classification on the SPRSound database. On the ICBHI database, disease prediction accuracy improved by 23% over the baseline in both the average score and the harmonic score. We have also developed a real-time respiratory sound discrimination system based on the Rene architecture. Using Edge AI technology, this system enables rapid and accurate responses for respiratory sound auscultation (https://github.com/zpforlove/Rene).
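The abstract reports gains in average and harmonic scores without defining them. A minimal sketch of how such scores are commonly computed on respiratory sound benchmarks is given below, assuming the usual convention that the average score is the arithmetic mean of sensitivity and specificity and the harmonic score is their harmonic mean. The function names, the `normal_label` parameter, and the example labels are illustrative assumptions, not taken from the Rene code base, and the exact evaluation protocol used in the study may differ.

```python
def sensitivity_specificity(y_true, y_pred, normal_label="Normal"):
    """Return (sensitivity, specificity) for a normal-vs-abnormal labelling.

    Assumed convention: sensitivity is the fraction of abnormal samples
    whose exact class is predicted correctly; specificity is the fraction
    of normal samples predicted as normal.
    """
    abnormal_total = abnormal_correct = 0
    normal_total = normal_correct = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == normal_label:
            normal_total += 1
            normal_correct += int(pred == truth)
        else:
            abnormal_total += 1
            abnormal_correct += int(pred == truth)
    se = abnormal_correct / abnormal_total if abnormal_total else 0.0
    sp = normal_correct / normal_total if normal_total else 0.0
    return se, sp


def average_score(se, sp):
    """Arithmetic mean of sensitivity and specificity."""
    return (se + sp) / 2


def harmonic_score(se, sp):
    """Harmonic mean of sensitivity and specificity (0 if both are 0)."""
    return 2 * se * sp / (se + sp) if (se + sp) else 0.0


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = ["Normal", "Normal", "Normal", "Normal", "Crackle", "Wheeze"]
    y_pred = ["Normal", "Normal", "Normal", "Crackle", "Crackle", "Crackle"]
    se, sp = sensitivity_specificity(y_true, y_pred)
    print(se, sp, average_score(se, sp), harmonic_score(se, sp))
```

The harmonic score penalizes an imbalance between sensitivity and specificity more strongly than the average score does, which is why benchmarks often report both.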