The successful deployment of deep learning-based acoustic echo and noise reduction (AENR) methods in consumer devices has spurred interest in low-complexity solutions and underscored the need for robust performance in real-life applications. In this work, we propose a hybrid approach that enhances the state-of-the-art (SOTA) ULCNet model by integrating time alignment and parallel encoder blocks for the model inputs, resulting in better echo reduction and noise reduction performance comparable to existing SOTA methods. We also propose a channel-wise sampling-based feature reorientation method that ensures robust performance across many challenging scenarios while keeping overall computational and memory requirements low.