Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? In spite of the sequential nature of the denoising steps, we show that surprisingly it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed and is even compatible with existing fast sampling techniques such as DDIM and DPMSolver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, giving state-of-the-art sampling speeds of 0.2s on 100-step DiffusionPolicy and 16s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score.
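The idea of guessing all future steps and refining them together can be illustrated on a toy problem. The sketch below is not the ParaDiGMS implementation. It assumes a plain Euler discretization of an ODE and uses a hypothetical `drift` function as a stand-in for a pretrained denoiser. The function names, the tolerance, and the choice to initialize every step to the starting point are likewise assumptions made for illustration. Each Picard iteration evaluates the drift at all time steps in one batched call, then rebuilds the whole trajectory with a cumulative sum.

```python
import numpy as np

def drift(x, t):
    # Hypothetical stand-in for the model-defined drift; not a real diffusion model.
    return -x * (1.0 + t)

def sequential_sample(x0, ts):
    # Baseline: one drift evaluation per step, each depending on the previous one.
    x = x0
    for i in range(len(ts) - 1):
        x = x + (ts[i + 1] - ts[i]) * drift(x, ts[i])
    return x

def picard_sample(x0, ts, tol=1e-10, max_iters=None):
    # Picard iteration on the Euler-discretized trajectory:
    #   x_j <- x_0 + sum_{i<j} h_i * drift(x_i, t_i)
    n = len(ts) - 1
    h = np.diff(ts)                       # step sizes, shape (n,)
    xs = np.tile(x0, (n + 1, 1))          # initial guess: every step equals x0
    max_iters = n if max_iters is None else max_iters
    for k in range(1, max_iters + 1):
        # All n drift evaluations are independent, so they form a single batch.
        d = drift(xs[:-1], ts[:-1, None])                 # shape (n, dim)
        increments = np.cumsum(h[:, None] * d, axis=0)    # prefix sums over steps
        new_xs = np.concatenate([x0[None], x0[None] + increments])
        err = np.max(np.abs(new_xs - xs))
        xs = new_xs
        if err < tol:
            break
    return xs[-1], k

x0 = np.array([1.0, -2.0, 0.5])
ts = np.linspace(0.0, 1.0, 51)            # 50 Euler steps
x_seq = sequential_sample(x0, ts)
x_par, iters = picard_sample(x0, ts)
print(iters, np.max(np.abs(x_par - x_seq)))
```

In this toy setting the Picard result matches the sequential one to numerical precision, and the number of iterations is at most the number of steps. Each iteration costs one batched drift evaluation over all steps, so total compute can exceed that of the sequential loop. Any wall-clock gain therefore depends on needing fewer iterations than steps and on hardware that can process the batch in parallel.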