Diffusion Handles is a novel approach for enabling 3D object edits on diffusion images. We accomplish these edits using existing pre-trained diffusion models and 2D image depth estimation, without any fine-tuning or 3D object retrieval. The edited results remain plausible and photo-real, and they preserve object identity. Diffusion Handles addresses a critically missing facet of generative-image-based creative design and significantly advances the state of the art in generative image editing. Our key insight is to lift the diffusion activations for an object to 3D using a proxy depth, 3D-transform the depth and associated activations, and project them back to image space. The diffusion process, applied to the manipulated activations with identity control, produces plausible edited images showing complex 3D occlusion and lighting effects. We evaluate Diffusion Handles quantitatively on a large synthetic data benchmark and qualitatively through a user study, showing our output to be more plausible than prior art and better at both 3D editing and identity control. Project webpage: https://diffusionhandles.github.io/
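The lift, transform, and project step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `warp_features`, the pinhole camera with a single focal length and a centered principal point, the rigid transform given as a rotation matrix and translation vector, and the nearest-pixel splatting with a simple depth ordering are all assumptions made for illustration. In the actual method the warped activations then guide a diffusion process, which this sketch does not cover.

```python
import numpy as np

def warp_features(features, depth, mask, rotation, translation, focal):
    """Warp per-pixel features of a masked object by a 3D rigid transform.

    features: (H, W, C) activations, depth: (H, W) proxy depth,
    mask: (H, W) boolean object mask, rotation: (3, 3), translation: (3,),
    focal: focal length in pixels (hypothetical pinhole camera).
    """
    h, w, _ = features.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0  # assumed principal point

    # Lift: unproject the object's pixels to 3D camera-space points.
    ys, xs = np.nonzero(mask)
    z = depth[ys, xs]
    pts = np.stack([(xs - cx) * z / focal, (ys - cy) * z / focal, z], axis=1)
    feats = features[ys, xs]

    # Transform: apply the rigid 3D edit to the points.
    pts = pts @ rotation.T + translation

    # Discard points that end up behind the camera.
    front = pts[:, 2] > 1e-6
    pts, feats = pts[front], feats[front]

    # Project: map the transformed points back to pixel coordinates.
    u = np.rint(pts[:, 0] * focal / pts[:, 2] + cx).astype(int)
    v = np.rint(pts[:, 1] * focal / pts[:, 2] + cy).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, zt, feats = u[inside], v[inside], pts[inside, 2], feats[inside]

    # Write far-to-near: with repeated indices NumPy keeps the last value
    # assigned, so the nearest point wins at each pixel.
    order = np.argsort(-zt)
    out = np.zeros_like(features)
    out_mask = np.zeros((h, w), dtype=bool)
    out[v[order], u[order]] = feats[order]
    out_mask[v[order], u[order]] = True
    return out, out_mask
```

With an identity rotation and zero translation the object's features are returned unchanged; a translation of `depth / focal` along x shifts a fronto-parallel object by one pixel. Nearest-pixel splatting can leave holes under rotation or scaling, which a fuller implementation would need to address.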