In recent years, it has become popular to tackle image restoration tasks with a single pretrained diffusion model (DM) and data-fidelity guidance, instead of training a dedicated deep neural network per task. However, such "zero-shot" restoration schemes currently require many Neural Function Evaluations (NFEs) to perform well, which may be attributed to the many NFEs needed for the original generative functionality of the DMs. Recently, faster variants of DMs have been explored for image generation. These include Consistency Models (CMs), which can generate samples with just a couple of NFEs. However, existing works that use guided CMs for restoration still require tens of NFEs, or per-task fine-tuning of the model, which leads to a performance drop if the assumptions made during fine-tuning are inaccurate. In this paper, we propose a zero-shot restoration scheme that uses CMs and operates well with as few as 4 NFEs. It is based on a careful combination of several ingredients: better initialization, back-projection guidance, and, above all, a novel noise injection mechanism. We demonstrate the advantages of our approach for image super-resolution, deblurring, and inpainting. Interestingly, we show that the usefulness of our noise injection technique goes beyond CMs: it can also mitigate the performance degradation of existing guided DM methods when their NFE count is reduced.
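To make the back-projection guidance ingredient concrete, here is a minimal, hypothetical sketch for a linear inverse problem y = H x: the current estimate (e.g. a CM output) is corrected with the pseudo-inverse so that it becomes exactly consistent with the measurements. The operator `H`, the helper `back_project`, and all dimensions are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 8
H = rng.standard_normal((m, n))   # degradation operator (e.g. blur + downsampling), assumed linear
x_true = rng.standard_normal(n)   # unknown clean signal
y = H @ x_true                    # observed measurements (noiseless for simplicity)

H_pinv = np.linalg.pinv(H)        # Moore-Penrose pseudo-inverse of H

def back_project(x_hat):
    """Correct x_hat along the row space of H so that H @ x_hat == y exactly."""
    return x_hat + H_pinv @ (y - H @ x_hat)

x_hat = rng.standard_normal(n)    # stand-in for a model's current estimate
x_bp = back_project(x_hat)

print(np.allclose(H @ x_bp, y))   # True: the estimate is now measurement-consistent
```

Because H here has full row rank, H @ H_pinv is the identity on the measurement space, so the corrected estimate satisfies the data-fidelity constraint exactly while retaining the model's content in the null space of H.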