We study the robustness of score-based generative modeling to errors in the estimate of the score function. In particular, we show that Langevin dynamics is not robust to $L^2$ errors (more generally, $L^p$ errors) in the score estimate. It is well established that, given a score estimate with small $L^2$ error, diffusion models can sample faithfully from the target distribution in a polynomial time horizon under fairly mild regularity assumptions. In contrast, we show that even for simple distributions in high dimensions, Langevin dynamics run for any polynomial time horizon produces a distribution far from the target distribution in total variation (TV) distance, even when the $L^2$ error (more generally, the $L^p$ error) of the score estimate is arbitrarily small. Since such errors are unavoidable in practice when the score function is learned from data, our results provide further justification for diffusion models over Langevin dynamics and caution against the use of Langevin dynamics with estimated scores.
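To make the objects in the abstract concrete, the following is a minimal one-dimensional toy sketch, not the construction used in this work. It assumes a standard Gaussian target, a hypothetical score estimate `estimated_score` that is exact on $|x| \le 5$ and sign-flipped elsewhere, and an artificial initialization far from the mode. All function names and parameter values are illustrative assumptions. The sketch only shows that a score estimate can have a small $L^2$ error under the target while still misleading Langevin dynamics; it does not reproduce the high-dimensional lower bound.

```python
import math
import random


def langevin(score, x0, step, n_steps, rng):
    """Unadjusted Langevin iteration: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    x = x0
    for _ in range(n_steps):
        x = x + step * score(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    return x


def true_score(x):
    # Target is the 1-D standard Gaussian, so grad log p(x) = -x.
    return -x


def estimated_score(x, radius=5.0):
    # Hypothetical estimate (an assumption for illustration):
    # exact inside |x| <= radius, sign-flipped outside.
    return -x if abs(x) <= radius else x


def l2_error(score_hat, score, n_samples, rng):
    """Monte Carlo estimate of E_{x ~ p} |score_hat(x) - score(x)|^2 under the target p."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)  # draw from the target N(0, 1)
        total += (score_hat(x) - score(x)) ** 2
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    print("L2 score error:", l2_error(estimated_score, true_score, 20000, rng))
    print("true score, start at 8:", langevin(true_score, 8.0, 0.01, 500, rng))
    print("estimated score, start at 8:", langevin(estimated_score, 8.0, 0.01, 500, rng))
```

In this toy setting the two scores differ only where the target places negligible mass, so the $L^2$ error measured under the target is small. A chain that starts in that region is nonetheless pushed away from the mode by the estimated score, whereas the true score pulls it back.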