We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps while maintaining high image quality. We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal, in combination with an adversarial loss that ensures high image fidelity even in the low-step regime of one or two sampling steps. Our analyses show that our model clearly outperforms existing few-step methods (GANs, Latent Consistency Models) in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps. ADD is the first method to unlock single-step, real-time image synthesis with foundation models. Code and weights are available at https://github.com/Stability-AI/generative-models and https://huggingface.co/stabilityai/.
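The combination of a score-distillation teacher signal with an adversarial loss can be illustrated with a toy objective. This is a minimal sketch and not the released implementation: the abstract does not give the form of either loss, so the hinge-style generator term, the mean-squared-error distillation term, the weighting factor `lam`, and all function names here are assumptions made for illustration. Inputs are flat lists of floats standing in for discriminator logits and model predictions.

```python
def hinge_generator_loss(disc_logits):
    """Assumed adversarial term: the student is rewarded when the
    discriminator assigns high logits to its samples."""
    return -sum(disc_logits) / len(disc_logits)


def distillation_loss(student_pred, teacher_pred):
    """Assumed distillation term: mean squared error between the
    student's output and the frozen teacher's prediction."""
    squared = [(s - t) ** 2 for s, t in zip(student_pred, teacher_pred)]
    return sum(squared) / len(squared)


def add_student_loss(disc_logits, student_pred, teacher_pred, lam=1.0):
    """Hypothetical combined objective: adversarial + lam * distillation."""
    adversarial = hinge_generator_loss(disc_logits)
    distill = distillation_loss(student_pred, teacher_pred)
    return adversarial + lam * distill


if __name__ == "__main__":
    loss = add_student_loss(
        disc_logits=[0.5, -0.5],
        student_pred=[1.0, 2.0],
        teacher_pred=[1.0, 4.0],
        lam=0.5,
    )
    print(loss)
```

In this sketch `lam` trades off the two signals: a larger value pulls the student toward the teacher's predictions, while a smaller value lets the adversarial term dominate.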