Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been limited. In this paper, we propose DI-AEC, a pioneering diffusion-based stochastic regeneration approach dedicated to AEC. We further propose FADI-AEC, a fast score-based diffusion AEC framework that reduces computational demands, making it well suited to edge devices. FADI-AEC runs the score model only once per frame, which substantially improves processing efficiency. In addition, we introduce a novel noise generation technique that uses far-end signals, incorporating both far-end and near-end signals to refine the score model's accuracy. We evaluate the proposed method on the ICASSP2023 Microsoft deep echo cancellation challenge evaluation dataset, where it outperforms some end-to-end methods and other diffusion-based echo cancellation methods.