In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it. Deep AHS is trained with teacher forcing, which converts the recurrent howling suppression process into an instantaneous speech separation process, simplifying the problem and accelerating model training. The proposed method uses carefully designed features and trains an attention-based recurrent neural network (RNN) to extract the target signal from the microphone recording, thereby attenuating the playback signal that may lead to howling. We investigate different training strategies and implement a streaming inference method that runs in a recurrent mode to evaluate the proposed method for real-time howling suppression. Deep AHS avoids howling detection and intrinsically prohibits howling from happening, allowing for more flexibility in the design of audio systems. Experimental results show the effectiveness of the proposed method for howling suppression under different scenarios.
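The effect of teacher forcing on the signal model can be illustrated with a minimal sketch. It assumes a single-channel loop with a hypothetical FIR feedback path `h`, loudspeaker gain `gain`, and system delay `delay`; these names, the toy values, and the simplification that the playback signal is just a delayed, amplified copy of its input are assumptions for illustration and are not taken from the paper. The separation network itself is omitted. In the closed loop the microphone signal depends on its own past, whereas under teacher forcing the playback is built from the clean target, so the microphone signal becomes a one-shot mixture.

```python
import numpy as np


def closed_loop_mic(s, h, gain, delay):
    """Recurrent case: the loudspeaker replays the delayed, amplified mic signal.

    s: target speech, h: assumed FIR feedback path, delay >= 1 samples.
    """
    n = len(s)
    d = np.zeros(n)  # microphone signal
    x = np.zeros(n)  # loudspeaker (playback) signal
    for t in range(n):
        if t >= delay:
            x[t] = gain * d[t - delay]  # playback depends on past mic samples
        k = min(len(h), t + 1)
        # feedback = sum_j h[j] * x[t - j]
        feedback = np.dot(h[:k], x[t - k + 1 : t + 1][::-1])
        d[t] = s[t] + feedback
    return d


def teacher_forced_mic(s, h, gain, delay):
    """Teacher-forced case: playback is built from the clean target instead
    of the system output, so the mixture no longer feeds back on itself."""
    n = len(s)
    x = np.zeros(n)
    x[delay:] = gain * s[: n - delay]
    playback = np.convolve(x, h)[:n]
    return s + playback, playback


# Toy demo with an unstable loop (loop gain 0.5 * 4 = 2, all values assumed).
rng = np.random.default_rng(0)
s = 0.1 * rng.standard_normal(400)
h = np.array([0.5])
d_loop = closed_loop_mic(s, h, gain=4.0, delay=10)
d_tf, playback = teacher_forced_mic(s, h, gain=4.0, delay=10)
print("closed-loop peak:   ", np.max(np.abs(d_loop)))
print("teacher-forced peak:", np.max(np.abs(d_tf)))
```

In this toy setting the closed-loop signal grows without bound while the teacher-forced mixture stays bounded, which is why a model trained on pairs like `(d_tf, s)` faces an ordinary separation task. At inference time the loop is closed again, so evaluation in a recurrent, streaming mode remains necessary.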