In recent years, diffusion models have gained popularity for their ability to generate higher-quality images than GAN models. However, like other large generative models, they require a huge amount of data, substantial computational resources, and meticulous tuning to train successfully. This poses a significant challenge and makes training infeasible for most individuals. As a result, the research community has devised methods that leverage pre-trained unconditional diffusion models with additional guidance for conditional image generation. These methods enable conditional image generation from diverse inputs and, most importantly, circumvent the need to train the diffusion model. In this paper, our objective is to reduce the time and computational overhead introduced by adding guidance to diffusion models while maintaining comparable image quality. Based on our empirical analysis, we propose a set of methods and demonstrate an approximately threefold reduction in computation time.