Self-supervised video denoising has seen decent progress through the use of blind-spot networks. However, under the blind-spot constraint, previous self-supervised video denoising methods suffer significant information loss and texture destruction in either the whole reference frame or the neighbor frames, because they give inadequate consideration to the receptive field. Moreover, the limited number of neighbor frames available to previous methods means that distant temporal information is discarded. Simply adopting existing recurrent frameworks does not solve this, since they easily break the receptive-field constraints imposed by self-supervision. In this paper, we propose RDRF for self-supervised video denoising, which not only fully exploits both the reference and neighbor frames with a denser receptive field, but also better leverages temporal information from both local and distant neighbor features. First, to make comprehensive use of information from both reference and neighbor frames, RDRF realizes a denser receptive field by taking in more neighbor pixels along the spatial and temporal dimensions. Second, it features a self-supervised recurrent video denoising framework that concurrently integrates distant and near-neighbor temporal features. This enables long-term bidirectional information aggregation while mitigating the error accumulation of a plain recurrent framework. Our method exhibits superior performance on both synthetic and real video denoising datasets. Code will be available at https://github.com/Wang-XIaoDingdd/RDRF.
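As a rough illustration of the blind-spot constraint mentioned in the abstract, the sketch below applies a single 2D filter whose center tap is zeroed, so the output at a pixel never reads the noisy input at that same pixel. This is not the RDRF architecture: the function name `blind_spot_conv2d`, the zero padding, and the single-layer setup are assumptions made for the example. Stacking ordinary convolutions on top of such a layer would let the center pixel leak back in through its neighbors, which is the kind of receptive-field violation the abstract refers to.

```python
import numpy as np


def blind_spot_conv2d(image, kernel):
    """Correlate `image` with `kernel` after zeroing the kernel's center tap.

    Hypothetical helper for illustration only; not taken from the RDRF code.
    """
    k = np.asarray(kernel, dtype=np.float64).copy()
    kh, kw = k.shape
    if kh % 2 == 0 or kw % 2 == 0:
        raise ValueError("kernel must have odd height and width")
    ph, pw = kh // 2, kw // 2
    k[ph, pw] = 0.0  # the blind spot: ignore the pixel being predicted

    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    # Zero padding (an assumption) guarantees a border pixel cannot
    # reach itself through a mirrored copy in the padded region.
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="constant")

    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            # Shifted view of the input weighted by the matching kernel tap.
            out += k[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.normal(size=(8, 8))
    weights = rng.normal(size=(3, 3))

    before = blind_spot_conv2d(noisy, weights)
    perturbed = noisy.copy()
    perturbed[4, 4] += 10.0  # change one input pixel only
    after = blind_spot_conv2d(perturbed, weights)

    # The output at (4, 4) is unaffected; its neighbors are not.
    print("center unchanged:", np.isclose(before[4, 4], after[4, 4]))
    print("neighbor changed:", not np.isclose(before[4, 5], after[4, 5]))
```

Perturbing one input pixel and comparing outputs, as in the `__main__` section, is a simple way to check whether any candidate layer or network respects the blind-spot constraint.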