Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? In spite of the sequential nature of the denoising steps, we show that surprisingly it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed and is even compatible with existing fast sampling techniques such as DDIM and DPMSolver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, giving state-of-the-art sampling speeds of 0.2s on 100-step DiffusionPolicy and 16s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score.
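To make the Picard-iteration idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes sampling can be written as an Euler discretization of an ODE, and it replaces the pretrained denoiser with a hypothetical toy `drift` function. The names `sequential_euler` and `picard_euler`, the tolerance, and the step grid are all illustrative assumptions. Every step starts from the same initial guess; each iteration then evaluates the drift at all steps at once (the part that could be batched across devices) and rebuilds the trajectory with a cumulative sum, until successive iterates agree.

```python
import numpy as np


def drift(x, t):
    # Hypothetical stand-in for the drift defined by a pretrained denoiser.
    # A real sampler would call the network here.
    return -x + np.sin(t)


def sequential_euler(x0, ts):
    # Baseline: one drift evaluation per step, strictly in order.
    xs = [x0]
    for i in range(len(ts) - 1):
        h = ts[i + 1] - ts[i]
        xs.append(xs[-1] + h * drift(xs[-1], ts[i]))
    return np.stack(xs)


def picard_euler(x0, ts, tol=1e-10, max_iters=None):
    # Picard iteration on the Euler discretization:
    #   x_{j+1}^{k+1} = x_0 + sum_{i <= j} h_i * drift(x_i^k, t_i)
    n = len(ts) - 1
    hs = np.diff(ts)
    xs = np.tile(x0, (n + 1, 1))  # initial guess: every step equals x0
    max_iters = n if max_iters is None else max_iters
    for k in range(1, max_iters + 1):
        # All n drift evaluations depend only on the previous iterate,
        # so they can be computed as a single batch.
        d = drift(xs[:-1], ts[:-1, None])
        steps = np.cumsum(hs[:, None] * d, axis=0)
        new_xs = np.concatenate([x0[None], x0[None] + steps])
        err = np.max(np.abs(new_xs - xs))
        xs = new_xs
        if err < tol:
            return xs, k
    return xs, max_iters


ts = np.linspace(0.0, 1.0, 51)  # 50 steps
x0 = np.array([1.0, -0.5])
reference = sequential_euler(x0, ts)
parallel, iters = picard_euler(x0, ts)
print("Picard iterations used:", iters, "for", len(ts) - 1, "steps")
```

In this toy setting each Picard iteration costs as many drift evaluations as the whole sequential pass, so total compute goes up. The potential gain is in wall-clock time, because the evaluations within one iteration are independent of each other and far fewer iterations than steps are needed. After `k` iterations the first `k` steps match the sequential result exactly up to floating-point rounding, so the iteration never needs more rounds than there are steps.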