This paper presents a lightweight two-stage online dereverberation algorithm for hearing devices. The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-filter. Both components rely on power spectral density (PSD) estimates provided by deep neural networks (DNNs). By deriving new metrics that analyze dereverberation performance in various time ranges, we confirm that directly optimizing a criterion at the output of the multi-channel linear filtering stage yields more efficient dereverberation than placing the criterion at the output of the DNN to optimize the PSD estimation. More concretely, we show that training this stage end-to-end helps further remove the reverberation in the range accessible to the filter, thus increasing the \textit{early-to-moderate} reverberation ratio. We argue and demonstrate that this stage can then be combined with a post-filtering stage to efficiently suppress the residual late reverberation, thereby increasing the \textit{early-to-final} reverberation ratio. The proposed two-stage procedure is shown to be very effective in terms of both dereverberation performance and computational demands when compared to, for example, recent state-of-the-art DNN approaches. Furthermore, the proposed two-stage system can be adapted to the needs of different types of hearing-device users by controlling the amount of reduction of early reflections.