DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing

Diffusion models have achieved remarkable image generation quality surpassing previous generative models. However, a notable limitation of diffusion models, in comparison to GANs, is their difficulty in smoothly interpolating between two image samples, due to their highly unstructured latent space. Such a smooth interpolation is intriguing as it naturally serves as a solution for the image morphing task with many applications. In this work, we present DiffMorpher, the first approach enabling smooth and natural image interpolation using diffusion models. Our key idea is to capture the semantics of the two images by fitting two LoRAs to them respectively, and interpolate between both the LoRA parameters and the latent noises to ensure a smooth semantic transition, where correspondence automatically emerges without the need for annotation. In addition, we propose an attention interpolation and injection technique and a new sampling schedule to further enhance the smoothness between consecutive images. Extensive experiments demonstrate that DiffMorpher achieves starkly better image morphing effects than previous methods across a variety of object categories, bridging a critical functional gap that distinguished diffusion models from GANs.
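The core idea above, interpolating both the LoRA parameters and the latent noises with a shared ratio, can be illustrated with a minimal sketch. The abstract does not state which interpolation formulas are used, so the choices here are assumptions: linear interpolation for the LoRA parameters and spherical linear interpolation (slerp) for the Gaussian latent noises. The function names `lerp`, `slerp`, and `interpolate_lora` are hypothetical, and parameters are represented as flat Python lists rather than tensors to keep the example self-contained.

```python
import math
from typing import Dict, List

Vector = List[float]


def lerp(a: Vector, b: Vector, alpha: float) -> Vector:
    """Linear interpolation: (1 - alpha) * a + alpha * b."""
    return [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]


def slerp(a: Vector, b: Vector, alpha: float) -> Vector:
    """Spherical linear interpolation between two non-zero noise vectors.

    Assumption: slerp is used for the latent noises; the abstract only
    says the noises are interpolated.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-6:
        # Vectors are (anti)parallel; the slerp weights are ill-defined,
        # so fall back to plain linear interpolation.
        return lerp(a, b, alpha)
    weight_a = math.sin((1.0 - alpha) * theta) / sin_theta
    weight_b = math.sin(alpha * theta) / sin_theta
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def interpolate_lora(
    lora_a: Dict[str, Vector], lora_b: Dict[str, Vector], alpha: float
) -> Dict[str, Vector]:
    """Blend two LoRA parameter sets that share the same keys and shapes.

    Assumption: the blend is a per-parameter linear interpolation.
    """
    return {name: lerp(lora_a[name], lora_b[name], alpha) for name in lora_a}


if __name__ == "__main__":
    # Toy stand-ins for the two fitted LoRAs and the two latent noises.
    lora_0 = {"attn.q": [0.0, 1.0]}
    lora_1 = {"attn.q": [1.0, 3.0]}
    noise_0 = [1.0, 0.0]
    noise_1 = [0.0, 1.0]
    for step in range(5):
        alpha = step / 4
        print(alpha, interpolate_lora(lora_0, lora_1, alpha), slerp(noise_0, noise_1, alpha))
```

Sweeping `alpha` from 0 to 1 yields one blended LoRA and one blended noise per intermediate frame; in a real pipeline each pair would then be passed to the diffusion sampler. Slerp is chosen for the noise in this sketch because it keeps the norm of an interpolated Gaussian sample close to that of the endpoints, which plain linear interpolation does not.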
