Conditional diffusion models can generate unseen images in a variety of settings, which makes them useful for image interpolation. Interpolation in latent space is well studied, but interpolation under specific conditions such as text or pose is less well understood. Simple approaches, such as linear interpolation in the condition space, often produce images that lack consistency, smoothness, and fidelity. To address this, we introduce a novel training-free technique named Attention Interpolation via Diffusion (AID). Our key contributions are: 1) proposing an inner/outer interpolated attention layer; 2) fusing the interpolated attention with self-attention to boost fidelity; and 3) applying a beta distribution to selection to increase smoothness. We also present a variant, Prompt-guided Attention Interpolation via Diffusion (PAID), that treats interpolation as a condition-dependent generative process. This method enables the creation of new images with greater consistency, smoothness, and efficiency, and offers control over the exact path of interpolation. Our approach is effective for both conceptual and spatial interpolation. Code and demo are available at https://github.com/QY-H00/attention-interpolation-diffusion.
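For concreteness, the simple baseline mentioned above, linear interpolation in the space of conditions, can be sketched as follows. This is a minimal illustration and not the AID method. The helper names `lerp_conditions` and `interpolation_path` and the toy three-dimensional embeddings are assumptions made for this example; in practice the conditions would be text or pose embeddings produced by the model's encoder.

```python
from typing import List


def lerp_conditions(c_a: List[float], c_b: List[float], t: float) -> List[float]:
    """Linearly blend two condition embeddings: (1 - t) * c_a + t * c_b."""
    if len(c_a) != len(c_b):
        raise ValueError("condition embeddings must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(c_a, c_b)]


def interpolation_path(
    c_a: List[float], c_b: List[float], num_frames: int
) -> List[List[float]]:
    """Return num_frames conditions evenly spaced from c_a to c_b (inclusive)."""
    if num_frames < 2:
        raise ValueError("need at least two frames (the two endpoints)")
    return [
        lerp_conditions(c_a, c_b, i / (num_frames - 1)) for i in range(num_frames)
    ]


# Toy embeddings standing in for two encoded prompts (hypothetical values).
emb_a = [0.0, 1.0, 2.0]
emb_b = [1.0, 1.0, 0.0]

# Each entry would be passed to the diffusion model as the condition for one frame.
path = interpolation_path(emb_a, emb_b, num_frames=5)
```

Each condition in `path` would be used to generate one image. The abstract's observation is that images produced this way often lack consistency, smoothness, and fidelity, which is the motivation for interpolating at the attention level instead.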