Deep learning has the potential to enhance speech signals and increase their intelligibility for users of hearing aids. Deep models suited for real-world applications should feature low computational complexity and a processing delay of only a few milliseconds. In this paper, we explore deep speech enhancement that matches these requirements and contrast monaural and binaural processing algorithms in two complex acoustic scenes. Both algorithms are evaluated with objective metrics and in experiments with hearing-impaired listeners performing a speech-in-noise test. Results are compared to two traditional enhancement strategies, namely adaptive differential microphone processing and binaural beamforming. While all algorithms perform similarly in diffuse noise, the binaural deep learning approach performs best in the presence of spatial interferers. A post-hoc analysis attributes this to improvements at low SNRs and to precise spatial filtering.