Blind-spot networks (BSNs) have become prevalent network architectures in self-supervised image denoising (SSID). Existing BSNs are mostly built with convolution layers. Although transformers offer potential solutions to the limitations of convolutions and have demonstrated success in various image restoration tasks, their attention mechanisms may violate the blind-spot requirement, thus restricting their applicability to SSID. In this paper, we present a transformer-based blind-spot network (TBSN) by analyzing and redesigning the transformer operators so that they meet the blind-spot requirement. Specifically, TBSN follows the architectural principles of dilated BSNs, and incorporates spatial as well as channel self-attention layers to enhance the network capability. For spatial self-attention, an elaborate mask is applied to the attention matrix to restrict its receptive field, thus mimicking dilated convolution. For channel self-attention, we observe that it may leak blind-spot information when the channel number is greater than the spatial size in the deep layers of multi-scale architectures. To eliminate this effect, we divide the channels into several groups and perform channel attention separately on each group. Furthermore, we introduce a knowledge distillation strategy that distills TBSN into smaller denoisers to improve computational efficiency while maintaining performance. Extensive experiments on real-world image denoising datasets show that TBSN largely extends the receptive field and exhibits favorable performance against state-of-the-art SSID methods. The code and pre-trained models will be publicly available at https://github.com/nagejacob/TBSN.
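The two attention variants described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names and shapes are hypothetical, projections are omitted (queries, keys, and values are taken as the input itself), and the mask here simply blocks each position from attending to itself to illustrate the blind-spot idea; the paper's actual mask is designed to mimic dilated convolution.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_spatial_attention(x, mask):
    # x: (N, C) flattened spatial tokens; mask: (N, N) additive mask,
    # 0 where attention is allowed, a large negative value where blocked.
    # Identity q/k/v projections for illustration only.
    n, c = x.shape
    attn = softmax(x @ x.T / np.sqrt(c) + mask, axis=-1)
    return attn @ x

def grouped_channel_attention(x, groups):
    # x: (N, C). Split channels into groups and attend within each group
    # only, so no single attention map mixes all C channels at once.
    outs = []
    for g in np.split(x, groups, axis=1):  # each g: (N, C // groups)
        attn = softmax(g.T @ g / np.sqrt(g.shape[0]), axis=-1)
        outs.append(g @ attn)
    return np.concatenate(outs, axis=1)

# Example: a blind-spot-style mask that forbids self-attention per position.
N, C = 16, 8
rng = np.random.default_rng(0)
tokens = rng.standard_normal((N, C))
mask = np.zeros((N, N))
np.fill_diagonal(mask, -1e9)  # each token cannot attend to itself
out_spatial = masked_spatial_attention(tokens, mask)
out_channel = grouped_channel_attention(tokens, groups=2)
```

Because the mask is added to the attention logits before the softmax, blocked positions receive (numerically) zero attention weight, which is how the receptive-field restriction is enforced without changing the attention computation itself.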