Computing expected information gain (EIG) from prior to posterior (equivalently, mutual information between candidate observations and model parameters or other quantities of interest) is a fundamental challenge in Bayesian optimal experimental design. We formulate flexible transport-based schemes for EIG estimation in general nonlinear/non-Gaussian settings, compatible with both standard and implicit Bayesian models. These schemes are representative of two-stage methods for estimating or bounding EIG using marginal and conditional density estimates. In this setting, we analyze the optimal allocation of samples between training (density estimation) and approximation of the outer prior expectation. We show that with this optimal sample allocation, the mean-squared error (MSE) of the resulting EIG estimator converges more quickly than that of a standard nested Monte Carlo scheme. We then address the estimation of EIG in high dimensions, by deriving gradient-based upper bounds on the mutual information lost by projecting the parameters and/or observations to lower-dimensional subspaces. Minimizing these upper bounds yields projectors and hence low-dimensional EIG approximations that outperform approximations obtained via other linear dimension reduction schemes. Numerical experiments on a PDE-constrained Bayesian inverse problem also illustrate a favorable trade-off between dimension truncation and the modeling of non-Gaussianity, when estimating EIG from finite samples in high dimensions.
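To make the baseline concrete: a standard nested Monte Carlo EIG estimator draws outer samples from the prior and simulated observations from the likelihood, then re-estimates the marginal likelihood with an inner prior sample for each observation. The sketch below is illustrative only and is not the paper's transport-based method; the linear-Gaussian model (`theta ~ N(0,1)`, `y = theta + noise`), the sample sizes `N` and `M`, and the analytic reference value are all assumptions chosen so the estimator can be checked against a closed-form mutual information.

```python
import numpy as np

def likelihood(y, theta, sigma):
    """Gaussian likelihood density p(y | theta) with noise std sigma."""
    return np.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def nested_mc_eig(N, M, sigma, rng):
    """Nested Monte Carlo estimate of EIG = E_y,theta[log p(y|theta) - log p(y)]."""
    theta = rng.standard_normal(N)               # outer samples from the prior
    y = theta + sigma * rng.standard_normal(N)   # simulated observations
    log_lik = np.log(likelihood(y, theta, sigma))
    # Inner loop: fresh prior samples to estimate the marginal p(y_i) for each y_i.
    theta_inner = rng.standard_normal((N, M))
    log_marg = np.log(likelihood(y[:, None], theta_inner, sigma).mean(axis=1))
    return np.mean(log_lik - log_marg)

rng = np.random.default_rng(0)
sigma = 1.0
# Analytic mutual information for this linear-Gaussian model: 0.5 * log(1 + 1/sigma^2).
eig_true = 0.5 * np.log(1 + 1 / sigma**2)
eig_hat = nested_mc_eig(N=20000, M=200, sigma=sigma, rng=rng)
```

The inner average over `M` samples is what a two-stage density-estimation scheme replaces with learned marginal/conditional densities; the known O(1/M) bias of the inner logarithm is one source of the slower MSE convergence that the optimal sample-allocation analysis improves upon.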