Denoising Diffusion Probabilistic Models (DDPM) have shown remarkable efficacy in synthesizing high-quality images. However, their inference process typically requires many iterative steps, potentially hundreds, which can lead to exposure bias as prediction errors accumulate over iterations. Previous work has attempted to mitigate this issue by perturbing inputs during training, which requires retraining the DDPM. In this work, we conduct a systematic study of exposure bias in diffusion models and find that it can be alleviated with a new sampling method, without retraining the model. We show empirically and theoretically that, during inference, for each backward time step $t$ and corresponding state $\hat{x}_t$, there might exist another time step $t_s$ that couples better with $\hat{x}_t$. Based on this finding, we introduce an inference method named Time-Shift Sampler. Our framework can be seamlessly integrated with existing sampling algorithms, such as DDIM or DDPM, incurring only minimal additional computation. Experimental results show that the proposed framework effectively improves the quality of images generated by existing sampling algorithms.
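To make the idea of a better-coupled time step $t_s$ concrete, the following is a minimal sketch of one plausible selection rule. The abstract does not state how $t_s$ is chosen, so everything here is an assumption for illustration: the variance-matching criterion, the approximation that the marginal variance at step $\tau$ is $1 - \bar{\alpha}_\tau$, the linear noise schedule, and the hypothetical names `linear_alpha_bar`, `select_shifted_step`, and `window`.

```python
import numpy as np


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def select_shifted_step(x_t, t, alpha_bar, window=10):
    """Pick the step near t whose assumed marginal variance best matches x_t.

    Assumption (not from the abstract): the variance of a sample at step tau
    is approximated by 1 - alpha_bar[tau], and candidates are restricted to
    a window of `window` steps centred on t.
    """
    lo = max(0, t - window // 2)
    hi = min(len(alpha_bar) - 1, t + window // 2)
    candidates = np.arange(lo, hi + 1)
    expected_var = 1.0 - alpha_bar[candidates]
    sample_var = np.var(x_t)  # variance over all elements of the current state
    best = np.argmin(np.abs(expected_var - sample_var))
    return int(candidates[best])


if __name__ == "__main__":
    alpha_bar = linear_alpha_bar()
    rng = np.random.default_rng(0)
    noise = rng.standard_normal((3, 32, 32))
    # Build a state whose variance corresponds to step 503, then query at t=500.
    unit = (noise - noise.mean()) / noise.std()
    x_hat = unit * np.sqrt(1.0 - alpha_bar[503])
    print(select_shifted_step(x_hat, t=500, alpha_bar=alpha_bar))
```

In a sampler loop, the returned step would replace $t$ as the time input to the denoising network for the current state, while the rest of the DDIM or DDPM update stays unchanged. The only extra work per step is a variance computation and a small search over the window.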