The essence of self-supervised image denoising is to restore the signal from the noisy image alone. State-of-the-art solutions for this task rely on masking pixels and training a fully convolutional neural network to impute them. These approaches typically require multiple forward passes, information about the noise model, and intricate regularization functions. In this paper, we propose a Swin Transformer-based Image Autoencoder (SwinIA), the first convolution-free architecture for self-supervised denoising. It can be trained end to end with a simple mean squared error loss, without masking, and requires no prior knowledge of the clean data or the noise distribution. Despite its simplicity, SwinIA achieves state-of-the-art results on several common benchmarks.
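To make the masking idea mentioned above concrete, here is a minimal sketch of a masked-pixel imputation loss, the kind of objective the abstract contrasts SwinIA with. It is not SwinIA and not any specific published method. The names `sample_mask`, `masked_mse`, and `box_blur` are hypothetical, zero-filling the hidden pixels is an assumption made for brevity, and `box_blur` is only a stand-in for a trained network.

```python
import numpy as np

def sample_mask(shape, ratio, rng):
    """Boolean mask selecting roughly `ratio` of the pixels at random."""
    return rng.random(shape) < ratio

def masked_mse(noisy, predict, ratio=0.05, rng=None):
    """MSE between predictions and the noisy values at the hidden pixels only."""
    rng = np.random.default_rng() if rng is None else rng
    mask = sample_mask(noisy.shape, ratio, rng)
    # Assumption: hidden pixels are zero-filled; other fill strategies exist.
    corrupted = np.where(mask, 0.0, noisy)
    restored = predict(corrupted)
    if not mask.any():
        return 0.0
    return float(np.mean((restored[mask] - noisy[mask]) ** 2))

def box_blur(img):
    """Hypothetical stand-in for the network: a 3x3 mean filter."""
    padded = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.normal(0.5, 0.1, size=(16, 16))
    print(masked_mse(noisy, box_blur, ratio=0.1, rng=rng))
```

Because the loss is evaluated only where the input was hidden, the predictor cannot lower it by copying its input, which is why masking-based schemes need this extra machinery in the first place.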