Recently, diffusion models have achieved significant advances in vision, text, and robotics. However, they still suffer from slow generation due to their sequential denoising process. To address this, a parallel sampling method based on Picard iteration was introduced, effectively reducing the number of sequential steps while ensuring exact convergence to the original output. Nonetheless, Picard iteration does not guarantee fast convergence, so generation can still be slow in practice. In this work, we propose a new parallelization scheme, the Picard Consistency Model (PCM), which significantly reduces the number of generation steps in Picard iteration. Inspired by the consistency model, PCM is trained directly to predict the fixed-point solution, i.e., the final output, from any stage of the convergence trajectory. Additionally, we introduce a new concept called model switching, which addresses PCM's limitations and ensures exact convergence. Extensive experiments demonstrate that PCM achieves up to a 2.71x speedup over sequential sampling and a 1.77x speedup over Picard iteration across various tasks, including image generation and robotic control.
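To illustrate the parallel sampling idea the abstract refers to, the sketch below runs Picard iteration on a toy one-dimensional ODE. This is not the paper's implementation: the drift `f`, step count `N`, sweep budget `K`, and tolerance are all illustrative assumptions. The key point is that within each sweep, all `N` drift evaluations are independent and could be batched (e.g., on a GPU), and the iteration stops once the trajectory reaches a fixed point, which matches the sequential (Euler) solution exactly.

```python
import numpy as np

def f(x, t):
    # Toy drift dx/dt = -x, standing in for a learned denoiser network.
    return -x

def picard_parallel(x0, T=1.0, N=50, K=20, tol=1e-6):
    """Parallel-in-time sampling via Picard iteration.

    Discretizes x(t) = x(0) + ∫ f(x(s), s) ds as
        x_i^{k+1} = x_0 + dt * sum_{j<i} f(x_j^k, t_j),
    whose fixed point is the sequential forward-Euler trajectory.
    """
    ts = np.linspace(0.0, T, N + 1)[:-1]
    dt = T / N
    xs = np.full(N + 1, x0, dtype=float)   # initial guess: constant path
    for k in range(K):
        drifts = f(xs[:-1], ts)            # N independent evaluations (parallelizable)
        new = x0 + dt * np.concatenate(([0.0], np.cumsum(drifts)))
        converged = np.max(np.abs(new - xs)) < tol
        xs = new
        if converged:
            break
    return xs[-1], k + 1

x_final, sweeps = picard_parallel(1.0)
```

For this linear toy problem, `x_final` matches the 50-step sequential Euler solution (close to exp(-1) ≈ 0.368) while using far fewer parallel sweeps than the 50 sequential steps, which is the gap the paper's PCM then shrinks further by predicting the fixed point directly.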