Recent work on diffusion models tackles conditional generation without additional training by using a differentiable loss function for guidance. While these methods have achieved some success, they often compromise sample quality and require small guidance step sizes, which lengthens the sampling process. This paper shows that the fundamental issue is manifold deviation during sampling when loss guidance is employed. We establish the existence of manifold deviation theoretically by deriving a lower bound on the estimation error of the loss guidance. To mitigate this problem, we propose Diffusion with Spherical Gaussian constraint (DSG), drawing inspiration from the concentration phenomenon in high-dimensional Gaussian distributions. DSG constrains the guidance step to the intermediate data manifold through optimization, which enables the use of larger guidance steps. Furthermore, we present a closed-form solution for DSG denoising under the Spherical Gaussian constraint. Notably, DSG integrates as a plug-in module into existing training-free conditional diffusion methods. Implementing DSG takes only a few lines of additional code with almost no extra computational overhead, yet it yields significant performance improvements. Comprehensive experiments on various conditional generation tasks validate the superiority and adaptability of DSG in terms of both sample quality and time efficiency.
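The two ideas the abstract relies on, the concentration of high-dimensional Gaussians and a closed-form constrained guidance step, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the constraint is the sphere of radius sqrt(n) * sigma around the predicted denoising mean, and that the guidance objective is the loss linearized at that mean; under those assumptions the minimizer is a step of length sqrt(n) * sigma along the negative gradient. The names `gaussian_norm` and `spherical_guidance_step`, and the random stand-ins for the mean and the loss gradient, are hypothetical.

```python
import math
import random


def gaussian_norm(n, sigma, rng):
    """Euclidean norm of one sample from N(0, sigma^2 * I_n)."""
    return math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n)))


def spherical_guidance_step(mean, grad, sigma):
    """Minimize the linearized loss <grad, x> over the sphere
    ||x - mean|| = sqrt(n) * sigma (an assumed form of the constraint).

    The closed-form minimizer moves from `mean` by exactly the sphere
    radius in the direction of the negative gradient.
    """
    n = len(mean)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    if grad_norm == 0.0:
        raise ValueError("zero gradient: the guidance direction is undefined")
    radius = math.sqrt(n) * sigma
    return [m - radius * g / grad_norm for m, g in zip(mean, grad)]


rng = random.Random(0)
n, sigma = 4096, 0.5
expected_radius = math.sqrt(n) * sigma  # sqrt(4096) * 0.5 = 32.0

# Concentration: sample norms cluster tightly around sqrt(n) * sigma.
norms = [gaussian_norm(n, sigma, rng) for _ in range(20)]
max_rel_dev = max(abs(r - expected_radius) / expected_radius for r in norms)

# Hypothetical stand-ins for the denoising mean and the loss gradient.
mean = [rng.gauss(0.0, 1.0) for _ in range(n)]
grad = [rng.gauss(0.0, 1.0) for _ in range(n)]

x_next = spherical_guidance_step(mean, grad, sigma)
step_len = math.sqrt(sum((x - m) ** 2 for x, m in zip(x_next, mean)))
# Inner product of the gradient with the step; negative means descent.
descent = sum(g * (x - m) for g, x, m in zip(grad, x_next, mean))

print("max relative deviation of sample norms:", max_rel_dev)
print("guided step length:", step_len, "sphere radius:", expected_radius)
```

In this sketch the step length is fixed by the sphere radius, not by a hand-tuned guidance scale, which is one way to read the claim that the constraint permits larger guidance steps without leaving the region where Gaussian samples concentrate.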