Guided diffusion is a technique for conditioning the output of a diffusion model at sampling time without retraining the network for each specific task. One drawback of diffusion models, however, is their slow sampling process. Recent techniques can accelerate unguided sampling by viewing the sampling process as a differential equation and applying high-order numerical methods to it. In contrast, we find that the same techniques do not work for guided sampling, whose acceleration has been little explored. This paper investigates the cause of this problem and provides a solution based on operator splitting methods, motivated by our key finding that classical high-order numerical methods are unsuitable for the conditional function. Our proposed method lets high-order methods be reused for guided sampling and generates images of the same quality as a 250-step DDIM baseline using 32-58% less sampling time on ImageNet256. We also demonstrate its use on a wide variety of conditional generation tasks, such as text-to-image generation, colorization, inpainting, and super-resolution.
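To make the operator-splitting idea in the abstract concrete, here is a minimal sketch of generic first-order (Lie-Trotter) splitting on a toy scalar ODE dx/dt = f(x, t) + g(x, t). It is not the paper's algorithm: the abstract does not say which splitting scheme or which solver is used for each term. Pairing a fourth-order Runge-Kutta step for f (standing in for the unguided dynamics) with a plain Euler step for g (standing in for the conditional function) is an assumption made for illustration, and the names `rk4_step`, `euler_step`, `lie_trotter_solve`, `toy_f`, and `toy_g` are all hypothetical.

```python
import math


def rk4_step(f, x, t, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x, t)."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * h * k1, t + 0.5 * h)
    k3 = f(x + 0.5 * h * k2, t + 0.5 * h)
    k4 = f(x + h * k3, t + h)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def euler_step(g, x, t, h):
    """One forward Euler step for dx/dt = g(x, t)."""
    return x + h * g(x, t)


def lie_trotter_solve(f, g, x0, t0, t1, n_steps):
    """Integrate dx/dt = f + g by alternating sub-steps on f and on g.

    Assumption for illustration: f gets the high-order solver and g gets
    the low-order one. The source does not specify this pairing.
    """
    h = (t1 - t0) / n_steps
    x, t = x0, t0
    for _ in range(n_steps):
        x = rk4_step(f, x, t, h)    # sub-problem 1: dx/dt = f(x, t)
        x = euler_step(g, x, t, h)  # sub-problem 2: dx/dt = g(x, t)
        t += h
    return x


# Toy stand-ins (hypothetical): in a real sampler f would come from the
# denoising network and g from the guidance gradient.
def toy_f(x, t):
    return -x


def toy_g(x, t):
    return -0.5 * x


if __name__ == "__main__":
    approx = lie_trotter_solve(toy_f, toy_g, 1.0, 0.0, 1.0, 100)
    exact = math.exp(-1.5)  # closed-form solution of dx/dt = -1.5 x at t = 1
    print(f"split solver: {approx:.6f}, exact: {exact:.6f}")
```

The point of the structure is that each term is handed to its own integrator, so a solver that misbehaves on one term need not be applied to the whole right-hand side. In an actual guided sampler, time would typically run from the noise level down to zero and x would be an image tensor rather than a scalar.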