Although diffusion-based image restoration (IR) methods have achieved remarkable success, their inference remains slow because they require hundreds or even thousands of sampling steps. Existing accelerated sampling techniques speed up the process but inevitably sacrifice some performance, producing overly blurry restorations. To address this issue, this study proposes a novel and efficient diffusion model for IR that significantly reduces the required number of diffusion steps. Because the method needs no post-hoc acceleration during inference, it avoids the associated performance deterioration. Specifically, the proposed method establishes a Markov chain that transitions between the high-quality and low-quality images by shifting the residual between them, substantially improving the transition efficiency. A carefully formulated noise schedule flexibly controls the shifting speed and the noise strength during the diffusion process. Extensive experimental evaluations demonstrate that the proposed method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks, namely image super-resolution, image inpainting, and blind face restoration, \textit{\textbf{even with only four sampling steps}}. Our code and model are publicly available at \url{https://github.com/zsyOAOA/ResShift}.
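To make the residual-shifting idea concrete, the following is a minimal sketch of one possible forward transition, written under stated assumptions rather than taken from the released code. It assumes a marginal of the form $x_t = x_0 + \eta_t (y_0 - x_0) + \kappa \sqrt{\eta_t}\,\epsilon$, where $x_0$ is the high-quality image, $y_0$ the low-quality image, $\eta_t$ a monotonically increasing shifting coefficient, and $\kappa$ a noise-strength factor. The function names `shift_schedule` and `shift_forward`, the geometric form of the schedule, and all default values are hypothetical choices made for illustration only.

```python
import math
import random


def shift_schedule(num_steps, eta_min=0.001, eta_max=0.999):
    """Hypothetical geometric schedule: eta_1 < ... < eta_T, all in (0, 1).

    This is an illustrative assumption, not the schedule from the paper.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    ratio = (eta_max / eta_min) ** (1.0 / (num_steps - 1))
    return [eta_min * ratio ** t for t in range(num_steps)]


def shift_forward(x0, y0, eta_t, kappa=2.0, rng=random):
    """Sample an intermediate state between x0 (high quality) and y0 (low quality).

    Assumed marginal: x_t = x0 + eta_t * (y0 - x0) + kappa * sqrt(eta_t) * noise.
    Images are represented as flat lists of floats to keep the sketch stdlib-only.
    """
    sigma = kappa * math.sqrt(eta_t)
    return [hq + eta_t * (lq - hq) + sigma * rng.gauss(0.0, 1.0)
            for hq, lq in zip(x0, y0)]


if __name__ == "__main__":
    x0 = [0.2, 0.8, 0.5]   # toy "high-quality" signal
    y0 = [0.3, 0.6, 0.5]   # toy "low-quality" signal
    for step, eta_t in enumerate(shift_schedule(4), start=1):
        print(step, round(eta_t, 4), shift_forward(x0, y0, eta_t))
```

In this sketch, as $\eta_t$ grows from near 0 to near 1 the mean of the state moves from the high-quality image toward the low-quality one, so the chain ends near the degraded input rather than at pure Gaussian noise, which is one plausible reading of why only a few steps would be needed.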