Echo cancellation and noise reduction are essential for full-duplex communication, yet most existing neural networks are computationally expensive and offer little flexibility in tuning model complexity. In this paper, we introduce time-frequency dual-path compression to achieve a wide range of compression ratios in computational cost. Specifically, for frequency compression, trainable filters replace manually designed filters for dimension reduction. For time compression, frame-skipped prediction alone causes large performance degradation, which can be alleviated by a post-processing network with full-sequence modeling. We find that, at a fixed compression ratio, dual-path compression combining the time and frequency methods yields further performance improvement, covering compression ratios from 4x to 32x with little change in model size. Moreover, the proposed models show competitive performance compared with Fast FullSubNet and DeepFilterNet. A demo page can be found at hangtingchen.github.io/ultra_dual_path_compression.github.io/.
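To make the two compression paths concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that frequency compression is a linear projection of each frame onto a smaller set of filters (randomly initialized here as a stand-in for trained weights), that time compression keeps every `time_ratio`-th frame, and that the two ratios multiply to give the overall ratio. All function names, sizes, and the hold-last-prediction step are hypothetical; in the paper, a post-processing network with full-sequence modeling takes the place of the simple hold.

```python
import random


def init_filters(num_bins, num_bands, seed=0):
    """Random stand-in for trainable filters: num_bands rows of num_bins weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(num_bins)]
            for _ in range(num_bands)]


def compress_frequency(frame, filters):
    """Project one frame of frequency bins onto the (smaller) filter set."""
    return [sum(w * x for w, x in zip(row, frame)) for row in filters]


def skip_frames(frames, time_ratio):
    """Keep every time_ratio-th frame; the main network would see only these."""
    return frames[::time_ratio]


def hold_predictions(predictions, time_ratio, num_frames):
    """Repeat each prediction over the frames that were skipped (assumed stand-in)."""
    return [predictions[t // time_ratio] for t in range(num_frames)]


# Hypothetical sizes chosen only for illustration.
num_frames, num_bins = 8, 16
time_ratio, freq_ratio = 2, 4

frames = [[float(t + f) for f in range(num_bins)] for t in range(num_frames)]
filters = init_filters(num_bins, num_bins // freq_ratio)

kept = skip_frames(frames, time_ratio)
compressed = [compress_frequency(frame, filters) for frame in kept]

values_in = num_frames * num_bins
values_out = len(compressed) * len(compressed[0])
overall_ratio = values_in // values_out  # time_ratio * freq_ratio under this assumption

restored = hold_predictions(compressed, time_ratio, num_frames)
```

Under these assumptions, a time ratio of 2 and a frequency ratio of 4 reduce the number of values the main network processes by a factor of 8, which illustrates how one overall ratio can be reached by different time and frequency splits.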