In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it. Deep AHS is trained with teacher forcing, which converts the recurrent howling suppression process into an instantaneous speech separation process, simplifying the problem and accelerating model training. The proposed method uses properly designed features and trains an attention-based recurrent neural network (RNN) to extract the target signal from the microphone recording, thereby attenuating the playback signal that may lead to howling. We investigate different training strategies and implement a streaming inference method that runs in a recurrent mode to evaluate the proposed method for real-time howling suppression. Deep AHS avoids howling detection and intrinsically prevents howling from occurring, allowing more flexibility in the design of audio systems. Experimental results show the effectiveness of the proposed method for howling suppression under different scenarios.
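The teacher-forcing idea in the abstract can be illustrated with a small simulation. The sketch below is not the Deep AHS implementation. It assumes a simplified signal model in which the microphone picks up the target speech plus the loudspeaker playback filtered by an acoustic path, and the loudspeaker replays the processed microphone signal after an amplifier gain and a delay. The names `simulate_closed_loop`, `teacher_forced_mixture`, `h`, `gain`, and `delay` are hypothetical and chosen for illustration only.

```python
import numpy as np


def simulate_closed_loop(s, h, gain, delay, process=None):
    """Sample-by-sample closed acoustic loop (assumed model, delay >= 1).

    s       : target speech at the microphone
    h       : acoustic path from loudspeaker to microphone (FIR taps)
    gain    : amplifier gain applied before playback
    delay   : loop delay in samples
    process : optional callable (t, y_t) -> output sample, standing in
              for a suppression model; None means no suppression
    """
    n = len(s)
    y = np.zeros(n)  # microphone signal
    x = np.zeros(n)  # loudspeaker signal
    for t in range(n):
        # Playback reaching the microphone: loudspeaker signal filtered by h.
        d = sum(h[k] * x[t - k] for k in range(len(h)) if t - k >= 0)
        y[t] = s[t] + d
        out = y[t] if process is None else process(t, y[t])
        # The output is amplified and replayed after the loop delay.
        if t + delay < n:
            x[t + delay] = gain * out
    return y


def teacher_forced_mixture(s, h, gain, delay):
    """Microphone signal when the loop is fed the ideal target (teacher forcing).

    Replacing the model output with the target s removes the recursion,
    so the mixture is computed in one shot, like a separation training pair.
    """
    n = len(s)
    x = np.zeros(n)
    x[delay:] = gain * s[: n - delay]  # loudspeaker replays delayed target
    d = np.convolve(x, h)[:n]          # playback component at the microphone
    return s + d
```

Under these assumptions, running `simulate_closed_loop` with no processing and a loop gain above one produces a signal that grows without bound, which is the howling behaviour. `teacher_forced_mixture` yields the same microphone signal that the closed loop would produce if the suppression model were perfect, which is why a model can be trained on such mixtures as an ordinary separation task and only run recurrently at inference time.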