Denoising diffusion models have recently shown impressive results in generative tasks. By learning powerful priors from huge collections of training images, such models can gradually transform pure noise into a clean natural image via a sequence of small denoising steps, which seemingly makes them well suited to single-image denoising. However, effectively applying denoising diffusion models to the removal of realistic noise is more challenging than it may seem, because their formulation is based on additive white Gaussian noise, which does not match the noise found in real-world images. In this work, we present SVNR, a novel formulation of denoising diffusion that assumes a more realistic, spatially-variant noise model. SVNR enables using the noisy input image as the starting point for the denoising diffusion process, in addition to conditioning the process on it. To this end, we adapt the diffusion process to allow each pixel to have its own time embedding, and propose training and inference schemes that support spatially-varying time maps. Our formulation also accounts for the correlation that exists between the condition image and the samples along the modified diffusion process. In our experiments, we demonstrate the advantages of our approach over a strong diffusion model baseline, as well as over a state-of-the-art single-image denoising method.
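
To make the idea of a spatially-varying time map concrete, the following is a minimal illustrative sketch, not the SVNR training or inference scheme itself. It assumes a signal-dependent Gaussian noise model (variance equal to a read term plus a term proportional to intensity) and a simple additive forward process x_t = x_0 + sigma[t] * eps with a linear schedule. The function names, `read_std`, `shot_gain`, and the schedule are all hypothetical choices made for illustration only.

```python
import numpy as np

def noise_std_map(clean, read_std=0.02, shot_gain=0.01):
    # Assumed noise model: per-pixel std grows with pixel intensity.
    return np.sqrt(read_std**2 + shot_gain * np.clip(clean, 0.0, 1.0))

def time_map_from_std(std_map, sigmas):
    # For each pixel, pick the first schedule step whose noise level
    # is at least that pixel's std. `sigmas` must be increasing.
    idx = np.searchsorted(sigmas, std_map, side="left")
    return np.clip(idx, 0, len(sigmas) - 1)

def diffuse_per_pixel(clean, t_map, sigmas, rng):
    # Additive forward process with a different time step at each pixel.
    eps = rng.standard_normal(clean.shape)
    return clean + sigmas[t_map] * eps

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(4, 4))   # stand-in for a clean image
sigmas = np.linspace(0.0, 0.5, 101)          # hypothetical linear schedule
std_map = noise_std_map(clean)
t_map = time_map_from_std(std_map, sigmas)   # integer time step per pixel
noisy = diffuse_per_pixel(clean, t_map, sigmas, rng)
```

Under these assumptions, brighter pixels receive a later time step than darker ones, so a single image spans a range of diffusion times rather than one global step.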