Generative processes that involve solving differential equations, such as diffusion models, frequently require trading off speed against quality. ODE-based samplers are fast but plateau in performance, while SDE-based samplers deliver higher sample quality at the cost of increased sampling time. We attribute this difference to sampling errors: ODE samplers incur smaller discretization errors, while the stochasticity in SDEs contracts accumulated errors. Based on these findings, we propose a novel sampling algorithm called Restart to better balance discretization errors and contraction. The method alternates between adding substantial noise in additional forward steps and strictly following a backward ODE. Empirically, the Restart sampler surpasses previous SDE and ODE samplers in both speed and accuracy. Restart not only outperforms the previous best SDE results but also accelerates sampling by 10-fold on CIFAR-10 and 2-fold on ImageNet $64 \times 64$. In addition, it attains significantly better sample quality than ODE samplers within comparable sampling times. Moreover, Restart balances text-image alignment and visual quality against diversity better than previous samplers in the large-scale text-to-image Stable Diffusion model pre-trained on LAION $512 \times 512$. Code is available at https://github.com/Newbeeer/diffusion_restart_sampling
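The alternation between forward noise injection and a backward ODE can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a variance-exploding parameterization in which $x_t = x_0 + t\,\epsilon$, so the probability-flow ODE is $dx/dt = (x - D(x, t))/t$ and jumping from noise level `r_lo` up to `r_hi` means adding Gaussian noise of variance `r_hi**2 - r_lo**2`. The Gaussian toy data, the closed-form `denoiser`, the Euler solver, and all names and default values below are hypothetical choices made for illustration.

```python
import math
import random

DATA_STD = 2.0  # assumed toy data distribution: x0 ~ N(0, DATA_STD^2)


def denoiser(x, t):
    """Exact E[x0 | x_t] for the toy Gaussian data; stands in for a trained network."""
    return x * DATA_STD**2 / (DATA_STD**2 + t**2)


def ode_backward(x, t_start, t_end, steps):
    """Euler integration of dx/dt = (x - denoiser(x, t)) / t from t_start down to t_end."""
    dt = (t_end - t_start) / steps  # negative: time runs backward
    t = t_start
    for _ in range(steps):
        x += dt * (x - denoiser(x, t)) / t
        t += dt
    return x


def restart_sample(rng, restarts=3, t_max=80.0, t_min=0.01,
                   r_lo=0.5, r_hi=2.0, main_steps=50, restart_steps=10):
    """Draw one sample; restarts=0 reduces to a plain Euler ODE sampler."""
    # Start from the noisy marginal at t_max.
    x = rng.gauss(0.0, math.sqrt(DATA_STD**2 + t_max**2))
    x = ode_backward(x, t_max, r_lo, main_steps)
    for _ in range(restarts):
        # Forward step: add noise to jump from r_lo back up to r_hi.
        x += rng.gauss(0.0, math.sqrt(r_hi**2 - r_lo**2))
        # Backward step: deterministic ODE from r_hi back down to r_lo.
        x = ode_backward(x, r_hi, r_lo, restart_steps)
    return ode_backward(x, r_lo, t_min, main_steps)


def sample_std(restarts, n=4000, seed=0):
    """Empirical standard deviation of n samples drawn with the given restart count."""
    rng = random.Random(seed)
    xs = [restart_sample(rng, restarts=restarts) for _ in range(n)]
    mean = sum(xs) / n
    return math.sqrt(sum((v - mean) ** 2 for v in xs) / n)


if __name__ == "__main__":
    print("plain ODE std:", sample_std(restarts=0))
    print("Restart std:  ", sample_std(restarts=3))
    print("target std:   ", DATA_STD)
```

In this toy setting the coarse Euler steps of the first ODE leg leave the samples too narrow. Each re-noising step followed by a finer backward ODE leg should pull the spread back toward `DATA_STD`, which mirrors the contraction argument in the abstract.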