We present a unified probabilistic formulation for diffusion-based image editing, in which a latent variable is edited in a task-specific manner and therefore generally deviates from the marginal distribution induced by the original stochastic or ordinary differential equation (SDE or ODE). The edited latent instead defines a corresponding SDE or ODE for editing. Within this formulation, we prove that, as time approaches zero, the Kullback-Leibler divergence between the marginal distributions of the two SDEs gradually decreases, whereas that between the marginal distributions of the two ODEs remains unchanged, which shows the promise of SDEs in image editing. Inspired by this result, we provide SDE counterparts for widely used ODE baselines in various tasks, including inpainting and image-to-image translation, where the SDE counterparts show a consistent and substantial improvement. Moreover, we propose SDE-Drag, a simple yet effective method built upon the SDE formulation for point-based content dragging. We build a challenging benchmark (termed DragBench) with open-set natural, art, and AI-generated images for evaluation. A user study on DragBench indicates that SDE-Drag significantly outperforms our ODE baseline, existing diffusion-based methods, and the renowned DragGAN. Our results demonstrate the superiority and versatility of SDEs in image editing and push the boundary of diffusion-based editing methods.
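The contrast between the SDE and ODE samplers can be illustrated with a one-dimensional Gaussian toy problem. The sketch below is an illustration under stated assumptions, not the proof or the experimental setup of this work: it assumes Gaussian data with variance `data_var`, a variance-preserving forward process dx = -x/2 dt + dw with the exact score, and an "edit" that shifts the mean of the latent at time T by `shift`. The function names (`gaussian_kl`, `kl_trajectories`) and all parameter values are hypothetical. Because every map is linear, the means and variances can be propagated in closed form and the KL divergence evaluated analytically at each Euler step.

```python
import math


def gaussian_kl(m_q, v_q, m_p, v_p):
    """KL( N(m_q, v_q) || N(m_p, v_p) ) for one-dimensional Gaussians."""
    return 0.5 * (v_q / v_p + (m_q - m_p) ** 2 / v_p - 1.0 + math.log(v_p / v_q))


def kl_trajectories(data_var=0.25, shift=1.0, T=3.0, steps=300):
    """Track KL(edited || original) while sampling from t = T down to t = 0.

    Assumed toy setting: data ~ N(0, data_var), forward process
    dx = -x/2 dt + dw, so the true marginal at time t is N(0, v(t)) with
    v(t) = 1 + (data_var - 1) * exp(-t) and score(x, t) = -x / v(t).
    """
    dt = T / steps
    v_T = 1.0 + (data_var - 1.0) * math.exp(-T)

    # Each sampler carries the original distribution p (mean 0) and the
    # edited distribution q (mean shifted by `shift`), both Gaussian.
    ode = {"p_var": v_T, "q_mean": shift, "q_var": v_T}
    sde = {"p_var": v_T, "q_mean": shift, "q_var": v_T}

    def kl(state):
        return gaussian_kl(state["q_mean"], state["q_var"], 0.0, state["p_var"])

    kl_ode, kl_sde = [kl(ode)], [kl(sde)]
    for k in range(steps):
        t = T - k * dt
        v = 1.0 + (data_var - 1.0) * math.exp(-t)

        # Probability-flow ODE: dx/dt = (-1/2 + 1/(2 v)) x, stepped backward.
        c = 1.0 - (-0.5 + 0.5 / v) * dt
        ode["q_mean"] *= c
        ode["q_var"] *= c * c
        ode["p_var"] *= c * c

        # Reverse SDE: drift (-1/2 + 1/v) x plus unit diffusion, stepped
        # backward with Euler-Maruyama (the noise adds variance dt).
        c = 1.0 - (-0.5 + 1.0 / v) * dt
        sde["q_mean"] *= c
        sde["q_var"] = sde["q_var"] * c * c + dt
        sde["p_var"] = sde["p_var"] * c * c + dt

        kl_ode.append(kl(ode))
        kl_sde.append(kl(sde))
    return kl_ode, kl_sde


if __name__ == "__main__":
    kl_ode, kl_sde = kl_trajectories()
    print(f"ODE: KL at t=T {kl_ode[0]:.4f}, at t=0 {kl_ode[-1]:.4f}")
    print(f"SDE: KL at t=T {kl_sde[0]:.4f}, at t=0 {kl_sde[-1]:.4f}")
```

In this toy setting the ODE step is an invertible linear map applied to both distributions, so the divergence is preserved, while the noise injected at each SDE step contracts it. The toy model says nothing about image quality; it only mirrors the qualitative behaviour of the divergence described above.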