Diffusion models (DMs) have been successfully applied to real image editing. These models typically first map an image to a latent noise vector from which the original image can be reconstructed (a step known as inversion), and then apply edits during the inference process. However, recent popular DMs often rely on a local linearization assumption, under which the noise injected at each step of the inversion process is expected to approximate the noise removed at the corresponding step of the inference process. Although this assumption lets DMs generate images efficiently, the approximation error it introduces accumulates over the diffusion process and ultimately degrades the quality of real image reconstruction and editing. To address this issue, we propose a novel method, ERDDCI (Exact Reversible Diffusion via Dual-Chain Inversion). ERDDCI uses a new Dual-Chain Inversion (DCI) for joint inference to derive an exactly reversible diffusion process. With DCI, our method avoids the cumbersome optimization process required by existing inversion approaches and achieves high-quality image editing. In addition, to accommodate image operations under high guidance scales, we introduce a dynamic control strategy that enables more refined image reconstruction and editing. Our experiments demonstrate that ERDDCI significantly outperforms state-of-the-art methods in a 50-step diffusion process. It achieves rapid and precise image reconstruction with an SSIM of 0.999 and an LPIPS of 0.001, and also delivers competitive results in image editing.
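The effect of the local linearization assumption can be illustrated with a minimal one-dimensional sketch of standard deterministic DDIM-style inversion. This is not the ERDDCI or DCI procedure. The noise schedule `alpha_bar`, the stand-in noise predictor `toy_eps`, and all other names are hypothetical choices made only for illustration. The sketch shows that a single step is exactly reversible when the same noise is used in both directions, whereas evaluating the predictor at the current state during inversion leaves a nonzero reconstruction error.

```python
import math


def alpha_bar(t, T):
    # Hypothetical cosine-style schedule: 1 at t = 0 (clean), close to 0 at t = T.
    return math.cos(0.5 * math.pi * t / (T + 1)) ** 2


def toy_eps(x, t, T):
    # Hypothetical stand-in for a trained noise predictor (nonlinear in x and t).
    return math.tanh(x) * (t / T) + 0.1 * math.sin(3.0 * x)


def ddim_move(x, eps, a_from, a_to):
    # Deterministic DDIM-style update between two noise levels using noise eps.
    x0_pred = (x - math.sqrt(1.0 - a_from) * eps) / math.sqrt(a_from)
    return math.sqrt(a_to) * x0_pred + math.sqrt(1.0 - a_to) * eps


def invert(x0, T):
    # Inversion: t -> t + 1. The noise is predicted at x_t, although an exact
    # inverse would need it at the unknown x_{t+1} (local linearization).
    x = x0
    for t in range(0, T):
        eps = toy_eps(x, t, T)
        x = ddim_move(x, eps, alpha_bar(t, T), alpha_bar(t + 1, T))
    return x


def reconstruct(xT, T):
    # Inference: t -> t - 1, with the noise predicted at x_t.
    x = xT
    for t in range(T, 0, -1):
        eps = toy_eps(x, t, T)
        x = ddim_move(x, eps, alpha_bar(t, T), alpha_bar(t - 1, T))
    return x


if __name__ == "__main__":
    x0, T = 0.7, 50
    error = abs(reconstruct(invert(x0, T), T) - x0)
    print("reconstruction error after", T, "steps:", error)
```

In this toy setting the mismatch arises only because the inversion and inference passes query the predictor at different states. An exactly reversible scheme has to remove that mismatch rather than approximate it.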