Auscultation is a simple, non-invasive method for diagnosing cardiovascular and respiratory disease in neonates. Such diagnosis often requires high-quality heart and lung sounds to be captured during auscultation. In most cases, however, obtaining high-quality sounds is non-trivial because chest sound recordings contain a mixture of heart sounds, lung sounds, and noise. Additional preprocessing is therefore needed to separate the chest sounds into heart and lung sounds. This paper proposes a novel deep-learning approach to this separation task. Inspired by the Conv-TasNet model, the proposed model has an encoder, a decoder, and a mask generator. The encoder consists of a 1D convolution and the decoder consists of a transposed 1D convolution. The mask generator is constructed from stacked 1D convolutions and transformers. On the artificial dataset, the proposed model outperforms previous methods by 2.01 dB to 5.06 dB in terms of objective distortion measures, and it is at least 17 times faster in computation time. The proposed model could therefore be a suitable preprocessing step for any phonocardiogram-based health monitoring system.
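As an illustration of the encoder, mask generator, and decoder layout described above, the following is a minimal PyTorch sketch and not the authors' implementation. The class name `SeparatorSketch`, all hyperparameters (filter count, kernel size, number of heads and layers), the ReLU and sigmoid activations, and the exact ordering of convolutions and transformer layers inside the mask generator are assumptions chosen only to make the example small and runnable.

```python
import torch
import torch.nn as nn


class SeparatorSketch(nn.Module):
    """Hypothetical Conv-TasNet-style separator: encoder -> masks -> decoder."""

    def __init__(self, n_sources=2, n_filters=64, kernel_size=16,
                 n_heads=4, n_layers=2):
        super().__init__()
        stride = kernel_size // 2  # assumed 50% overlap between frames
        self.n_sources = n_sources
        self.n_filters = n_filters
        # Encoder: one 1D convolution mapping the waveform to latent frames.
        self.encoder = nn.Conv1d(1, n_filters, kernel_size,
                                 stride=stride, bias=False)
        # Mask generator: a 1D convolution, transformer encoder layers,
        # and a 1x1 convolution producing one mask per source.
        self.pre_conv = nn.Conv1d(n_filters, n_filters, kernel_size=3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model=n_filters, nhead=n_heads,
                                           dim_feedforward=4 * n_filters,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.mask_conv = nn.Conv1d(n_filters, n_sources * n_filters, kernel_size=1)
        # Decoder: one transposed 1D convolution back to the waveform domain.
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel_size,
                                          stride=stride, bias=False)

    def forward(self, mixture):
        # mixture: (batch, samples) chest sound recording
        batch = mixture.shape[0]
        latent = torch.relu(self.encoder(mixture.unsqueeze(1)))   # (B, N, T)
        h = torch.relu(self.pre_conv(latent))                     # (B, N, T)
        # The transformer expects (batch, time, features).
        h = self.transformer(h.transpose(1, 2)).transpose(1, 2)   # (B, N, T)
        masks = torch.sigmoid(self.mask_conv(h))                  # (B, S*N, T)
        masks = masks.view(batch, self.n_sources, self.n_filters, -1)
        # Apply each source mask to the shared latent representation.
        masked = latent.unsqueeze(1) * masks                      # (B, S, N, T)
        flat = masked.reshape(batch * self.n_sources, self.n_filters, -1)
        decoded = self.decoder(flat)                              # (B*S, 1, L)
        return decoded.view(batch, self.n_sources, -1)            # (B, S, L)


model = SeparatorSketch()
model.eval()
with torch.no_grad():
    # Three random 4000-sample "recordings" stand in for chest sounds.
    separated = model(torch.randn(3, 4000))
```

With these assumed settings, `separated` has shape `(3, 2, 4000)`: one estimated waveform per source (for example heart and lung) for each input recording. The output length equals the input length here only because 4000 samples fit the chosen kernel size and stride exactly; other lengths would need padding.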