In recent years, diffusion models have become the most popular and powerful methods in image synthesis, even rivaling human artists in artistic creativity. However, the key issue currently limiting their application is an extremely slow generation process. Although several methods have been proposed to speed up generation, a trade-off between efficiency and quality remains. In this paper, we first provide a detailed theoretical and empirical analysis of the scheduler-based generation process of diffusion models. We reduce the problem of designing a scheduler to the determination of several parameters, and further recast the accelerated generation process as the expansion of a linear subspace. Based on this analysis, we propose a novel method called Optimal Linear Subspace Search (OLSS), which accelerates generation by searching, within the linear subspaces spanned by latent variables, for the optimal approximation of the complete generation process. OLSS is able to generate high-quality images with a very small number of steps. To demonstrate the effectiveness of our method, we conduct extensive comparative experiments on open-source diffusion models. Experimental results show that, for a given number of steps, OLSS significantly improves the quality of generated images. On an NVIDIA A100 GPU, OLSS makes it possible to generate a high-quality image with Stable Diffusion within one second, without other optimization techniques.
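The idea of searching a linear subspace spanned by latent variables for the best approximation of a target can be illustrated with an ordinary least-squares projection. The following is a minimal sketch under stated assumptions, not the OLSS algorithm itself: the abstract does not specify the solver, the basis construction, or the step selection, so the function `project_onto_span`, the tensor shapes, and the random stand-in latents are all hypothetical.

```python
import numpy as np


def project_onto_span(basis, target):
    """Return least-squares coefficients w and the approximation sum_i w[i] * basis[i].

    Assumption: "optimal approximation" is taken to mean minimal Euclidean
    distance between the target and a linear combination of the basis tensors.
    """
    # Flatten each basis tensor into one column of a (d, k) matrix.
    A = np.stack([b.ravel() for b in basis], axis=1)
    # Solve min_w ||A w - target||_2.
    w, *_ = np.linalg.lstsq(A, target.ravel(), rcond=None)
    approx = (A @ w).reshape(target.shape)
    return w, approx


# Hypothetical stand-ins for latent variables collected along a generation path.
rng = np.random.default_rng(0)
basis = [rng.standard_normal((4, 8, 8)) for _ in range(3)]

# A target that lies exactly in the span, so the projection should recover it.
true_w = np.array([0.5, -1.0, 2.0])
target_in_span = sum(c * b for c, b in zip(true_w, basis))
w, approx = project_onto_span(basis, target_in_span)
residual = np.linalg.norm(approx - target_in_span)

# A target outside the span: the leftover error is orthogonal to every basis tensor.
noisy_target = target_in_span + 0.1 * rng.standard_normal((4, 8, 8))
_, noisy_approx = project_onto_span(basis, noisy_target)
error = noisy_target - noisy_approx
inner_products = [float(np.sum(error * b)) for b in basis]
```

In this toy setting, adding another tensor to `basis` can only keep the approximation error the same or reduce it, which is one way to read the abstract's description of acceleration as an expansion of the linear subspace.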