Score-based diffusion models have significantly advanced high-dimensional data generation across many domains by learning a denoising oracle (or score) from data. From a Bayesian perspective, they provide realistic models of data priors and make it possible to solve inverse problems through posterior sampling. Although many heuristic methods have recently been developed for this purpose, they lack the quantitative guarantees required in many scientific applications. In this work, we introduce the \textit{tilted transport} technique, which combines the quadratic structure of the log-likelihood in linear inverse problems with the prior denoising oracle to transform the original posterior sampling problem into a new `boosted' posterior that is provably easier to sample from. We quantify the conditions under which this boosted posterior is strongly log-concave, highlighting their dependence on the condition number of the measurement matrix and on the signal-to-noise ratio. Through a direct analysis, the resulting posterior sampling scheme is shown to reach the computational threshold predicted for sampling Ising models [Kunisky'23], and it is further validated on high-dimensional Gaussian mixture models and scalar field $\varphi^4$ models.
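To make the setting concrete, the display below writes out a linear inverse problem with Gaussian noise and its posterior. The symbols $A$, $y$, $\sigma$, $\pi$, and $\alpha$ are notation assumed here for illustration and may differ from the notation used in the body of the paper; this is a sketch of the standard setup, not a statement of the tilted transport construction itself.

\[
\begin{aligned}
y &= A x + \sigma w, \qquad w \sim \mathcal{N}(0, I), \\
\log p(y \mid x) &= -\frac{1}{2\sigma^2}\,\|y - A x\|^2 + \mathrm{const}
 = -\frac{1}{2\sigma^2}\, x^\top A^\top A\, x + \frac{1}{\sigma^2}\, y^\top A x + \mathrm{const}, \\
p(x \mid y) &\propto \pi(x)\, \exp\!\Big(-\frac{1}{2\sigma^2}\,\|y - A x\|^2\Big),
\end{aligned}
\]

where each $\mathrm{const}$ collects terms independent of $x$. Under these assumptions the log-likelihood is quadratic in $x$, with Hessian $-A^\top A/\sigma^2$. Assuming $\pi$ is twice differentiable, the posterior is $\alpha$-strongly log-concave exactly when $-\nabla^2 \log \pi(x) + A^\top A/\sigma^2 \succeq \alpha I$ for all $x$ and some $\alpha > 0$. The likelihood term can only add curvature along directions that $A$ measures well, which is consistent with the role the abstract assigns to the conditioning of $A$ and the signal-to-noise ratio.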