Conditional diffusion models can create unseen images in various settings, aiding image interpolation. Interpolation in latent spaces is well studied, but interpolation under specific conditions such as text or poses is less understood. Simple approaches, such as linear interpolation in the space of conditions, often produce images that lack consistency, smoothness, and fidelity. To address this, we introduce a novel training-free technique named Attention Interpolation via Diffusion (AID). Our key contributions include 1) proposing an inner/outer interpolated attention layer; 2) fusing the interpolated attention with self-attention to boost fidelity; and 3) applying a Beta distribution to the selection of interpolation coefficients to increase smoothness. We also present a variant, Prompt-guided Attention Interpolation via Diffusion (PAID), which treats interpolation as a condition-dependent generative process. This method enables the creation of new images with greater consistency, smoothness, and efficiency, and offers control over the exact path of interpolation. Our approach demonstrates effectiveness on both conceptual and spatial interpolation. Code and demo are available at https://github.com/QY-H00/attention-interpolation-diffusion.
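To make the core idea concrete, the "outer" interpolated attention can be sketched as attending from the queries of the interpolated branch to the concatenated keys/values of the two endpoint branches, with the interpolation coefficient weighting each side, and the coefficients themselves drawn from a Beta distribution. This is a minimal, hypothetical NumPy illustration, not the authors' implementation: the function name, the log-space weighting of the concatenated logits, and the Beta parameters (2, 2) are all assumptions for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def outer_interpolated_attention(q, k_a, v_a, k_b, v_b, t):
    """Hypothetical sketch: the interpolated branch's queries attend to the
    concatenated keys/values of both endpoint branches, with the
    interpolation coefficient t weighting each side in log space."""
    d = q.shape[-1]
    logits_a = q @ k_a.T / np.sqrt(d) + np.log(1.0 - t + 1e-8)
    logits_b = q @ k_b.T / np.sqrt(d) + np.log(t + 1e-8)
    weights = softmax(np.concatenate([logits_a, logits_b], axis=-1))
    values = np.concatenate([v_a, v_b], axis=0)
    return weights @ values

# Hypothetical coefficient selection: drawing interpolation coefficients
# from a Beta distribution concentrates samples where the perceptual
# transition tends to change fastest, improving smoothness.
rng = np.random.default_rng(0)
ts = np.sort(rng.beta(2.0, 2.0, size=7))
```

Each coefficient in `ts` would drive one denoising pass with the interpolated attention above, yielding an image sequence between the two conditions.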