Computing expected information gain (EIG) from prior to posterior (equivalently, mutual information between candidate observations and model parameters or other quantities of interest) is a fundamental challenge in Bayesian optimal experimental design. We formulate flexible transport-based schemes for EIG estimation in general nonlinear/non-Gaussian settings, compatible with both standard and implicit Bayesian models. These schemes are representative of two-stage methods for estimating or bounding EIG using marginal and conditional density estimates. In this setting, we analyze the optimal allocation of samples between training (density estimation) and approximation of the outer prior expectation. We show that with this optimal sample allocation, the mean squared error (MSE) of the resulting EIG estimator converges more quickly than that of a standard nested Monte Carlo scheme. We then address the estimation of EIG in high dimensions, by deriving gradient-based upper bounds on the mutual information lost by projecting the parameters and/or observations to lower-dimensional subspaces. Minimizing these upper bounds yields projectors and hence low-dimensional EIG approximations that outperform approximations obtained via other linear dimension reduction schemes. Numerical experiments on a PDE-constrained Bayesian inverse problem also illustrate a favorable trade-off between dimension truncation and the modeling of non-Gaussianity, when estimating EIG from finite samples in high dimensions.
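For orientation, the standard nested Monte Carlo baseline that the abstract compares against can be sketched as follows. This is not the transport-based scheme described above. The scalar linear-Gaussian model, the function names `nested_mc_eig` and `exact_eig`, and the sample sizes are all assumptions chosen for illustration, picked so that the estimate can be checked against a closed-form mutual information.

```python
import numpy as np


def nested_mc_eig(a, sigma, n_outer, n_inner, rng):
    """Nested Monte Carlo estimate of EIG for a toy model (illustrative assumption):
    theta ~ N(0, 1),  y | theta ~ N(a * theta, sigma^2).
    """
    log_norm = -np.log(sigma * np.sqrt(2.0 * np.pi))

    # Outer loop: draw (theta_i, y_i) from the joint prior-predictive model.
    theta = rng.standard_normal(n_outer)
    y = a * theta + sigma * rng.standard_normal(n_outer)
    log_lik = -0.5 * ((y - a * theta) / sigma) ** 2 + log_norm

    # Inner loop: fresh prior samples to estimate the marginal p(y_i).
    theta_in = rng.standard_normal((n_outer, n_inner))
    log_lik_in = -0.5 * ((y[:, None] - a * theta_in) / sigma) ** 2 + log_norm

    # Numerically stable log-mean-exp over the inner samples.
    m = log_lik_in.max(axis=1, keepdims=True)
    log_marginal = m[:, 0] + np.log(np.mean(np.exp(log_lik_in - m), axis=1))

    # EIG is approximated by the average of log p(y | theta) - log p(y).
    return float(np.mean(log_lik - log_marginal))


def exact_eig(a, sigma):
    """Closed-form mutual information for the toy linear-Gaussian model."""
    return 0.5 * np.log(1.0 + a ** 2 / sigma ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("nested MC:", nested_mc_eig(1.0, 1.0, 2000, 500, rng))
    print("exact    :", exact_eig(1.0, 1.0))
```

In this sketch the inner average is a plain Monte Carlo estimate of the marginal density, and the cost is `n_outer * n_inner` likelihood evaluations. The two-stage methods described in the abstract instead replace that inner loop with learned marginal and conditional density estimates.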