Denoising diffusions sample from a probability distribution $\mu$ in $\mathbb{R}^d$ by constructing a stochastic process $({\hat{\boldsymbol x}}_t:t\ge 0)$ in $\mathbb{R}^d$ such that ${\hat{\boldsymbol x}}_0$ is easy to sample, while the distribution of $\hat{\boldsymbol x}_T$ at large $T$ approximates $\mu$. The drift ${\boldsymbol m}:\mathbb{R}^d\times\mathbb{R}\to\mathbb{R}^d$ of this diffusion process is learned by minimizing a score-matching objective. Is every probability distribution $\mu$ for which sampling is tractable also amenable to sampling via diffusions? We provide evidence to the contrary by studying a probability distribution $\mu$ for which sampling is easy, but the drift of the diffusion process is intractable, assuming a popular conjecture on information-computation gaps in statistical estimation. We show that there exist drifts that are superpolynomially close to the optimum value (among polynomial-time drifts) and yet yield samples whose distribution is very far from the target distribution.