This paper presents a deep learning system for detecting anomalies in respiratory sound recordings. The system first extracts audio features using Gammatone and Continuous Wavelet transformations, which convert the respiratory sound input into a two-dimensional spectrogram that represents both spectral and temporal features. The spectrograms are then fed into Inception-residual-based backbone models, combined with multi-head attention and a multi-objective loss, to classify respiratory anomalies. We conducted experiments on the SPRSound benchmark dataset (the Open-Source SJTU Paediatric Respiratory Sound database), released for the IEEE BioCAS 2022 challenge. In terms of the Score, computed as the mean of the average score and the harmonic score, our proposed system improves on the challenge baseline system by 9.7%, 15.8%, 17.0%, and 9.4% in Task 1-1, Task 1-2, Task 2-1, and Task 2-2, respectively. Notably, our system ranked Top-1 in Task 2-1, with the highest Score of 73.7%.
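The Score described above can be illustrated with a minimal sketch. It assumes that the average score is the arithmetic mean of sensitivity and specificity and that the harmonic score is their harmonic mean; the abstract does not spell out these definitions, so they are assumptions here, and the function name `challenge_score` is hypothetical. How sensitivity and specificity are derived for the multi-class tasks is not covered.

```python
def challenge_score(sensitivity: float, specificity: float) -> float:
    """Return the mean of the average score and the harmonic score.

    Assumption: both component scores are built from sensitivity and
    specificity, each given as a fraction in [0, 1].
    """
    # Average score: arithmetic mean of sensitivity and specificity.
    average_score = (sensitivity + specificity) / 2

    # Harmonic score: harmonic mean, defined as 0 when both inputs are 0.
    total = sensitivity + specificity
    harmonic_score = 2 * sensitivity * specificity / total if total > 0 else 0.0

    # Final Score: mean of the two component scores.
    return (average_score + harmonic_score) / 2


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    print(round(challenge_score(0.8, 0.6), 4))  # 0.6929
```

Because the harmonic mean penalizes imbalance, a system with high sensitivity but low specificity (or the reverse) scores lower than a balanced system with the same arithmetic mean.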