Precise and user-friendly manipulation of image content that preserves image fidelity has always been crucial to image editing. Thanks to the power of generative models, recent point-based image editing methods allow users to interactively change image content with high generalizability by clicking several control points. However, this editing process usually assumes that features stay constant during the motion supervision step from the initial points to the target points. In this work, we conduct a comprehensive investigation of the feature space of diffusion models and find that features change acutely under in-plane rotation. Based on this finding, we propose a novel approach named RotationDrag, which significantly improves point-based image editing performance when users intend to rotate image content in-plane. Our method tracks handle points more precisely by utilizing the feature map of the rotated images, thus ensuring precise optimization and high image fidelity. Furthermore, we build an in-plane-rotation-focused benchmark called RotateBench, the first benchmark to evaluate point-based image editing methods under in-plane rotation on both real and generated images. A thorough user study demonstrates the superior capability of our method in accomplishing the in-plane rotation that users intend, compared with the DragDiffusion baseline and other existing diffusion-based methods. See the project page https://github.com/Tony-Lowe/RotationDrag for code and experiment results.
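To make the point-tracking step concrete, the following is a minimal sketch of nearest-neighbor handle-point tracking in a feature map, not the RotationDrag implementation. The function name `track_point`, the L1 distance, the square search window, and the random toy feature map are all assumptions made for illustration. In the setting described above, the reference feature would presumably be read from the feature map of the rotated image rather than from the original one; here it is simply copied from a known location so the behavior is easy to check.

```python
import numpy as np


def track_point(feature_map, ref_feature, center, radius):
    """Hypothetical helper: find the location near `center` whose feature
    vector is closest (L1 distance) to `ref_feature`.

    feature_map: array of shape (H, W, C)
    ref_feature: array of shape (C,)
    center:      (row, col) of the current handle-point estimate
    radius:      half-width of the square search window
    """
    h, w, _ = feature_map.shape
    r0, c0 = center
    # Clip the search window to the feature-map boundaries.
    top, bottom = max(r0 - radius, 0), min(r0 + radius + 1, h)
    left, right = max(c0 - radius, 0), min(c0 + radius + 1, w)
    window = feature_map[top:bottom, left:right]
    # L1 distance between every feature in the window and the reference.
    dist = np.abs(window - ref_feature).sum(axis=-1)
    dr, dc = np.unravel_index(np.argmin(dist), dist.shape)
    return int(top + dr), int(left + dc)


# Toy example: a random feature map stands in for diffusion features.
rng = np.random.default_rng(0)
fmap = rng.normal(size=(32, 32, 8))
ref = fmap[12, 20].copy()  # reference feature taken at a known location
print(track_point(fmap, ref, center=(10, 18), radius=3))
```

If the reference feature no longer matches the features at the true handle location, for example because the content has been rotated in-plane, the minimum of `dist` can land on the wrong pixel, which is the failure mode the abstract attributes to the constant-feature assumption.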