Diffusion models have achieved state-of-the-art performance in generative modeling, yet their sampling procedures remain vulnerable to hallucinations, often stemming from inaccuracies in score approximation. In this work, we reinterpret diffusion sampling through the lens of optimization and introduce RODS (Robust Optimization-inspired Diffusion Sampler), a novel method that detects and corrects high-risk sampling steps using geometric cues from the loss landscape. RODS enforces smoother sampling trajectories and adaptively adjusts perturbations, reducing hallucinations without retraining and at minimal additional inference cost. Experiments on AFHQv2, FFHQ, and 11k-hands demonstrate that RODS maintains comparable image quality and preserves generation diversity. More importantly, it improves both sampling fidelity and robustness, detecting over 70% of hallucinated samples and correcting more than 25%, all while avoiding the introduction of new artifacts. We release our code at https://github.com/Yiqi-Verna-Tian/RODS.