Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising

Image denoising is a fundamental and challenging task in computer vision. Most supervised denoising methods learn to reconstruct clean images from noisy inputs; such methods have an intrinsic spectral bias and tend to produce over-smoothed, blurry images. Recently, researchers have explored diffusion models to generate high-frequency details in image restoration tasks, but these models do not guarantee that the generated texture aligns with real images, which leads to undesirable artifacts. To address the trade-off between visual appeal and fidelity of high-frequency details in denoising, we propose the Reconstruct-and-Generate Diffusion Model (RnG). Our method uses a reconstructive denoising network to recover the majority of the underlying clean signal, which serves as the initial estimate for subsequent steps and maintains fidelity. It then employs a diffusion algorithm to generate the residual high-frequency details, thereby enhancing visual quality. We further introduce a two-stage training scheme to ensure effective collaboration between the reconstructive and generative modules of RnG. To reduce undesirable texture introduced by the diffusion model, we also propose an adaptive step controller that regulates the number of inverse steps applied by the diffusion model, which controls the level of high-frequency detail added to each patch and reduces the computational cost of inference. With RnG, we achieve a better balance between perception and distortion. Extensive experiments on both synthetic and real denoising datasets validate the superiority of the proposed approach.
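
The reconstruct-then-generate flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the moving-average `reconstruct`, the variance-based `choose_steps`, the damping `reverse_step`, and the constants `0.05`, `0.1`, and `0.9` are all hypothetical stand-ins for trained networks and tuned hyperparameters. The sketch only shows the control flow: an initial estimate, a per-patch step count, and a residual that is refined and added back.

```python
import random
from typing import List, Tuple

Patch = List[float]  # a flattened image patch with values in [0, 1]


def reconstruct(noisy: Patch) -> Patch:
    """Stand-in for the reconstructive denoiser: a 3-tap moving average."""
    n = len(noisy)
    out = []
    for i in range(n):
        window = noisy[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def choose_steps(initial: Patch, max_steps: int) -> int:
    """Hypothetical step controller: textured patches get more steps.

    Uses the variance of the initial estimate as a crude texture measure;
    the 0.05 normalisation constant is an arbitrary assumption.
    """
    mean = sum(initial) / len(initial)
    var = sum((v - mean) ** 2 for v in initial) / len(initial)
    return min(max_steps, round(max_steps * min(1.0, var / 0.05)))


def reverse_step(residual: Patch) -> Patch:
    """Placeholder for one learned reverse diffusion step.

    A trained network conditioned on the initial estimate would go here;
    this stand-in merely damps the residual by an assumed factor of 0.9.
    """
    return [0.9 * r for r in residual]


def rng_denoise(noisy: Patch, max_steps: int = 8, seed: int = 0) -> Tuple[Patch, int]:
    """Return (denoised patch, number of reverse steps actually run)."""
    rng = random.Random(seed)
    initial = reconstruct(noisy)               # fidelity-oriented estimate
    steps = choose_steps(initial, max_steps)   # per-patch step budget
    if steps == 0:
        return initial, 0                      # smooth patch: skip generation
    # Start the residual from noise whose scale grows with the step budget
    # (the 0.1 factor is an arbitrary assumption).
    sigma = 0.1 * steps / max_steps
    residual = [rng.gauss(0.0, sigma) for _ in initial]
    for _ in range(steps):
        residual = reverse_step(residual)
    # Add the generated residual back and clamp to the valid range.
    output = [min(1.0, max(0.0, a + r)) for a, r in zip(initial, residual)]
    return output, steps
```

Under these assumptions, a flat patch receives zero reverse steps and is returned as the reconstruction alone, while a textured patch receives a nonzero budget. This mirrors the stated goal of limiting generated texture and saving computation where little detail is needed.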
