Conditional diffusion models can generate unseen images in a variety of settings, which makes them useful for image interpolation. Interpolation in latent spaces is well studied, but interpolation under specific conditions such as text or poses is less understood. Simple approaches, such as linear interpolation in the space of conditions, often produce images that lack consistency, smoothness, and fidelity. To address this, we introduce a novel training-free technique named Attention Interpolation via Diffusion (AID). Our key contributions are 1) proposing an inner/outer interpolated attention layer; 2) fusing the interpolated attention with self-attention to boost fidelity; and 3) using a Beta distribution for selection to increase smoothness. We also present a variant, Prompt-guided Attention Interpolation via Diffusion (PAID), that treats interpolation as a condition-dependent generative process. This method enables the creation of new images with greater consistency, smoothness, and efficiency, and offers control over the exact path of interpolation. Our approach is effective for both conceptual and spatial interpolation. Code and demo are available at https://github.com/QY-H00/attention-interpolation-diffusion.
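The abstract names an inner/outer interpolated attention layer without defining it. The following is a minimal sketch under one plausible reading, not the authors' implementation: we assume "inner" means interpolating the keys and values of the two endpoint conditions before attention is applied, and "outer" means interpolating the two attention outputs afterwards. The function names (`lerp`, `attention`, `inner_interpolated_attention`, `outer_interpolated_attention`) and the toy matrices are hypothetical and chosen only for illustration; the fusion with self-attention and the Beta-based selection are not modelled here.

```python
import math


def lerp(A, B, t):
    """Element-wise linear interpolation between two equally shaped matrices."""
    return [[(1 - t) * a + t * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def attention(Q, K, V):
    """Scaled dot-product attention on nested lists (each row is one token)."""
    scale = math.sqrt(len(Q[0]))
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        )
    return out


def inner_interpolated_attention(Q, K1, V1, K2, V2, t):
    """Assumed 'inner' variant: interpolate keys and values, then attend once."""
    return attention(Q, lerp(K1, K2, t), lerp(V1, V2, t))


def outer_interpolated_attention(Q, K1, V1, K2, V2, t):
    """Assumed 'outer' variant: attend to each endpoint, then interpolate outputs."""
    return lerp(attention(Q, K1, V1), attention(Q, K2, V2), t)


# Toy data: two query tokens and two key/value tokens per endpoint condition.
Q = [[1.0, 0.0], [0.0, 1.0]]
K1, V1 = [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]
K2, V2 = [[0.0, 1.0], [1.0, 0.0]], [[2.0, 2.0], [3.0, 3.0]]

for t in (0.0, 0.5, 1.0):
    print(t, inner_interpolated_attention(Q, K1, V1, K2, V2, t))
    print(t, outer_interpolated_attention(Q, K1, V1, K2, V2, t))
```

Under this reading, both variants reduce to ordinary attention on the first condition at t = 0 and on the second at t = 1; they differ only in between, because the softmax is nonlinear in the keys.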