Gradient-based dimension reduction decreases the cost of Bayesian inference and probabilistic modeling by identifying maximally informative (and informed) low-dimensional projections of the data and parameters, allowing high-dimensional problems to be reformulated as cheaper low-dimensional problems. A broad family of such techniques identifies these projections, and provides error bounds on the resulting posterior approximations, via eigendecompositions of certain diagnostic matrices. Yet computing these matrices requires gradients, or even Hessians, of the log-likelihood, which excludes the purely data-driven setting and many problems of simulation-based inference. We propose a framework, derived from score matching, that extends gradient-based dimension reduction to problems where gradients are unavailable. Specifically, we formulate an objective function to directly learn the score ratio function needed to compute the diagnostic matrices, propose a tailored parameterization for the score ratio network, and introduce regularization methods that exploit the hypothesized low-dimensional structure. We also introduce a novel algorithm, based on eigenvalue deflation, that iteratively identifies the low-dimensional reduced basis vectors more accurately from limited data. We show that our approach outperforms standard score matching for problems with low-dimensional structure, and demonstrate its effectiveness for PDE-constrained Bayesian inverse problems and conditional generative modeling.
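To make the eigendecomposition step concrete, here is a minimal sketch (not the paper's implementation) of how a diagnostic matrix and reduced basis might be formed once score evaluations are available. It assumes the common construction in which the diagnostic matrix is a Monte Carlo estimate of the expected outer product of the score (ratio) evaluated at samples; all variable names are illustrative.

```python
import numpy as np

def diagnostic_matrix(scores):
    """Monte Carlo estimate of H = E[s(x) s(x)^T].

    scores: (N, d) array whose rows are score-ratio evaluations
    (e.g., learned approximations of grad log-likelihood) at N samples.
    """
    return scores.T @ scores / scores.shape[0]

def reduced_basis(scores, r):
    """Leading r eigenpairs of the diagnostic matrix (reduced basis)."""
    H = diagnostic_matrix(scores)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(H)
    idx = np.argsort(eigvals)[::-1][:r]  # reorder descending, keep top r
    return eigvals[idx], eigvecs[:, idx]

# Toy example: scores concentrated along one direction plus small noise,
# mimicking a problem with one-dimensional informed structure.
rng = np.random.default_rng(0)
u = np.array([1.0, 0.0, 0.0])
scores = rng.normal(size=(1000, 1)) * u + 0.01 * rng.normal(size=(1000, 3))
vals, vecs = reduced_basis(scores, 1)
print(np.abs(vecs[:, 0]))  # leading eigenvector aligns with u
```

The low-dimensional structure assumed in the abstract corresponds to a sharp decay in the eigenvalues of `H`, which is what licenses truncating to the top `r` directions.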