When, why, and how do diffusion posterior samplers fail? A finite-sample lens

Diffusion models have excellent capacity to model complex distributions of natural data, which has made them a popular and effective choice for posterior sampling in imaging inverse problems. Existing methods can incorporate any measurement model at inference time, but for computational tractability they must use an inexact approximation of the likelihood at intermediate timesteps. Although these approximations often work well empirically, their downstream effect on the sampled posterior is poorly understood and can result in unexplained failures. To understand when, why, and how these likelihood approximations propagate to erroneous posterior distributions, we introduce a finite-sample perspective on posterior sampling that approximates the posterior to arbitrary precision as the training set size tends towards infinity, for any forward model and prior distribution. Using this finite-sample lens, we observe that popular posterior sampling approximations tend to under- or over-estimate the spread of the posterior at intermediate timesteps. The downstream consequences include sensitivity to early stopping time, inaccurate relative weighting of posterior modes, and hallucination, both of prior modes that are not in the posterior and of likelihood modes that are not supported by the prior. Moreover, we find that these posterior errors require neither a nonlinear measurement model nor a multimodal posterior: they can arise solely from a multimodal prior combined with inaccurate posterior spread at intermediate sampling times. Our finite-sample posterior sampling approach is agnostic to the type of likelihood approximation and to the type of forward model (linear or nonlinear), and can thus serve as a drop-in diagnostic to evaluate the accuracy and failure modes of existing and future posterior samplers.
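
As an illustration of what a finite-sample view of the posterior can look like, the sketch below treats the prior as the empirical distribution over a training set, so that the posterior given a measurement is a reweighting of the training points by their likelihood. This is a minimal sketch under assumptions that the abstract does not state: additive Gaussian measurement noise with standard deviation `noise_std`, a uniform empirical prior, and a forward diffusion of the form x_t = alpha_t * x_0 + sigma_t * eps. The function names and parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def finite_sample_posterior_weights(train_x, y, forward, noise_std):
    """Posterior weights over training points under a uniform empirical prior.

    Assumes y = forward(x) + Gaussian noise with std noise_std (an assumption).
    `forward` may be any linear or nonlinear function of a single sample.
    """
    n = len(train_x)
    # Residual between each training point's predicted measurement and y.
    residuals = np.stack([forward(x) for x in train_x]) - y
    log_w = -0.5 * np.sum(residuals.reshape(n, -1) ** 2, axis=1) / noise_std**2
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

def sample_noisy_posterior(train_x, weights, alpha_t, sigma_t, n_samples, rng):
    """Draw samples of x_t given y at an intermediate diffusion time.

    Because x_t depends on y only through x_0, the noisy posterior is a
    Gaussian mixture centered at alpha_t * x_i with mixture weights `weights`.
    """
    idx = rng.choice(len(train_x), size=n_samples, p=weights)
    noise = rng.standard_normal((n_samples,) + train_x.shape[1:])
    return alpha_t * train_x[idx] + sigma_t * noise

# Toy usage: a two-point (bimodal) prior in 1-D with an identity forward model.
rng = np.random.default_rng(0)
train_x = np.array([[-1.0], [1.0]])
weights = finite_sample_posterior_weights(
    train_x, y=np.array([1.0]), forward=lambda x: x, noise_std=0.5
)
samples = sample_noisy_posterior(
    train_x, weights, alpha_t=0.8, sigma_t=0.6, n_samples=1000, rng=rng
)
```

Samples drawn this way at each timestep could be compared against the intermediate samples of an approximate posterior sampler, for example by comparing their spread, which is the kind of diagnostic comparison the abstract describes.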
