Drag-based image editing using generative models provides precise control over image content, enabling users to manipulate anything in an image with a few clicks. However, prevailing methods typically adopt $n$-step iterations of latent semantic optimization to achieve drag-based image editing, which is time-consuming and limits practical applications. In this paper, we introduce a novel one-step drag-based image editing method, namely FastDrag, to accelerate the editing process. Central to our approach is a latent warpage function (LWF), which simulates the behavior of a stretched material to adjust the location of individual pixels within the latent space. This innovation achieves one-step latent semantic optimization and hence significantly accelerates editing. Meanwhile, null regions emerging after applying the LWF are addressed by our proposed bilateral nearest neighbor interpolation (BNNI) strategy, which interpolates these regions using similar features from neighboring areas, thereby enhancing semantic integrity. Additionally, a consistency-preserving strategy is introduced to maintain consistency between the edited and original images: semantic information from the original image, saved as key and value pairs in the self-attention module during diffusion inversion, is adopted to guide the diffusion sampling. Our FastDrag is validated on the DragBench dataset, demonstrating substantial improvements in processing time over existing methods while achieving enhanced editing performance. Project page: https://fastdrag-site.github.io/ .
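To make the two-stage idea concrete, the following is a minimal NumPy sketch, not the paper's actual LWF: a naive rigid translation of latent features from a handle point toward a target point stands in for the warp, leaving NaN "null regions" where no feature lands, which are then filled by scanning for the nearest valid neighbors along the vertical and horizontal directions and averaging them, in the spirit of BNNI. All function names and the uniform-shift simplification are hypothetical illustrations.

```python
import numpy as np

def one_step_warp(latent, handle, target):
    """Shift the whole latent grid by (target - handle) in one step.

    A stand-in for the latent warpage function: real LWF deforms the
    latent like stretched material rather than translating it rigidly.
    Cells that receive no feature are left as NaN (null regions).
    """
    H, W = latent.shape
    dy, dx = target[0] - handle[0], target[1] - handle[1]
    warped = np.full_like(latent, np.nan)
    ys, xs = np.mgrid[0:H, 0:W]
    ny, nx = ys + dy, xs + dx
    valid = (ny >= 0) & (ny < H) & (nx >= 0) & (nx < W)
    warped[ny[valid], nx[valid]] = latent[ys[valid], xs[valid]]
    return warped

def bnni_fill(warped):
    """Fill null regions from nearest valid neighbors in four directions."""
    out = warped.copy()
    H, W = out.shape
    for y in range(H):
        for x in range(W):
            if np.isnan(out[y, x]):
                vals = []
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    # walk outward until a valid (non-null) cell is found
                    while 0 <= ny < H and 0 <= nx < W:
                        if not np.isnan(warped[ny, nx]):
                            vals.append(warped[ny, nx])
                            break
                        ny += dy
                        nx += dx
                if vals:
                    out[y, x] = np.mean(vals)
    return out

latent = np.arange(16, dtype=float).reshape(4, 4)
warped = one_step_warp(latent, handle=(0, 0), target=(0, 1))  # drag right by 1
filled = bnni_fill(warped)
```

Because the warp and the fill are each a single closed-form pass over the latent, there is no per-step gradient optimization, which is the source of the claimed speedup over $n$-step iterative methods.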
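The consistency-preserving strategy can likewise be sketched abstractly: during inversion, the self-attention keys and values computed from the original image are cached; during sampling, the edited latent's queries attend over those cached pairs so that appearance details are drawn from the original image. The snippet below is a hedged illustration with toy token matrices; the linear projections, multi-head structure, and scheduler of a real diffusion U-Net are omitted, and all variable names are assumptions, not the paper's API.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(q, k, v):
    # standard scaled dot-product attention over token rows
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

rng = np.random.default_rng(0)
tokens_orig = rng.standard_normal((5, 8))  # original image's latent tokens (from inversion)
tokens_edit = rng.standard_normal((5, 8))  # edited latent's tokens (during sampling)

# Inversion pass: cache the original image's key/value pairs
# (projection matrices omitted in this toy sketch).
k_cache, v_cache = tokens_orig, tokens_orig

# Sampling pass: queries come from the edited latent, but keys and
# values come from the cache, steering the output toward the original.
out = self_attention(tokens_edit, k_cache, v_cache)
```

The design choice here is that only keys and values are swapped: the queries still belong to the edited latent, so the dragged geometry is kept while texture and identity are borrowed from the original image.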