We present a unified probabilistic formulation for diffusion-based image editing, in which a latent variable is edited in a task-specific manner and generally deviates from the corresponding marginal distribution induced by the original stochastic or ordinary differential equation (SDE or ODE). The edited latent instead defines a corresponding SDE or ODE for editing. Within this formulation, we prove that the Kullback-Leibler divergence between the marginal distributions of the two SDEs gradually decreases as time approaches zero, whereas that of the two ODEs remains unchanged, which shows the promise of SDEs in image editing. Inspired by this result, we provide SDE counterparts for widely used ODE baselines in various tasks, including inpainting and image-to-image translation, where the SDE counterparts show a consistent and substantial improvement. Moreover, we propose SDE-Drag, a simple yet effective method built upon the SDE formulation for point-based content dragging. We build a challenging benchmark (termed DragBench) with open-set natural, art, and AI-generated images for evaluation. A user study on DragBench indicates that SDE-Drag significantly outperforms our ODE baseline, existing diffusion-based methods, and the renowned DragGAN. Our results demonstrate the superiority and versatility of SDEs in image editing and push the boundary of diffusion-based editing methods.
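The claim that the KL divergence contracts under the SDE but not under the ODE can be illustrated with a toy case. The sketch below is not the paper's proof or implementation. It assumes a one-dimensional Gaussian data distribution, a variance-preserving forward SDE with constant `beta`, and an "edited" latent that is itself Gaussian, so that every marginal stays Gaussian and its mean and variance can be tracked in closed form. The function names (`kl_trajectory`, `gaussian_kl`, `marginal_var`) and all default parameter values are hypothetical choices made for this illustration.

```python
import math


def gaussian_kl(m1, v1, m2, v2):
    """KL( N(m1, v1) || N(m2, v2) ) for 1-D Gaussians (v denotes variance)."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)


def marginal_var(t, beta, data_var):
    """Variance of the forward VP-SDE marginal at time t for N(0, data_var) data."""
    decay = math.exp(-beta * t)
    return data_var * decay + 1.0 - decay


def kl_trajectory(stochastic, beta=1.0, data_var=0.25, T=5.0, steps=1000,
                  edit_mean=1.0, edit_var=0.5):
    """Track KL(edited marginal || original marginal) from time T down to 0.

    Both marginals are pushed through the same Euler step of either the
    reverse SDE (stochastic=True) or the probability-flow ODE
    (stochastic=False), using the exact score of the original process.
    """
    dt = T / steps
    ref_m, ref_v = 0.0, marginal_var(T, beta, data_var)  # original latent
    ed_m, ed_v = edit_mean, edit_var                     # edited latent (assumed)
    kls = [gaussian_kl(ed_m, ed_v, ref_m, ref_v)]
    for k in range(steps):
        t = T - k * dt
        # Exact score of the original marginal: score(x, t) = s * x.
        s = -1.0 / marginal_var(t, beta, data_var)
        if stochastic:
            a = beta * (0.5 + s)        # reverse-SDE drift coefficient
            noise = beta * dt           # variance injected by one step
        else:
            a = 0.5 * beta * (1.0 + s)  # probability-flow ODE drift coefficient
            noise = 0.0
        c = 1.0 + a * dt                # one Euler step is x -> c * x (+ noise)
        ref_m, ref_v = c * ref_m, c * c * ref_v + noise
        ed_m, ed_v = c * ed_m, c * c * ed_v + noise
        kls.append(gaussian_kl(ed_m, ed_v, ref_m, ref_v))
    return kls


if __name__ == "__main__":
    sde = kl_trajectory(stochastic=True)
    ode = kl_trajectory(stochastic=False)
    print("SDE: KL at t=T:", sde[0], " KL at t=0:", sde[-1])
    print("ODE: KL at t=T:", ode[0], " KL at t=0:", ode[-1])
```

In this toy setting each ODE step is a deterministic invertible linear map applied to both distributions, which leaves the KL divergence unchanged, while each SDE step also adds the same Gaussian noise to both, which can only bring them closer. Real editing pipelines use learned scores and non-Gaussian latents, so this example only conveys the intuition behind the result.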