Solving inverse problems with Bayesian methods can become prohibitively expensive when likelihood evaluations involve complex, large-scale numerical models. A common way to circumvent this issue is to approximate the forward model or the likelihood function with a surrogate model. Even then, limited computational resources mean that only a few training points are available in many practically relevant cases. It can therefore be advantageous to model the uncertainty of the surrogate itself, so that the epistemic uncertainty due to limited data is taken into account. In this paper, we develop a novel approach that approximates the log-likelihood by a constrained Gaussian process, exploiting prior knowledge about its boundedness. This improves the accuracy of the surrogate approximation without increasing the number of training samples. Additionally, we introduce a formulation that integrates the epistemic uncertainty due to limited training points into the posterior density approximation. This is combined with a state-of-the-art active learning strategy for selecting training points, which makes it possible to approximate posterior densities in higher dimensions very efficiently. We demonstrate the fast convergence of our approach on a benchmark problem and infer a random field discretized by 30 parameters using only about 1000 model evaluations. In a practically relevant example, the parameters of a reduced lung model are calibrated based on flow observations over time and voltage measurements from a coupled electrical impedance tomography simulation.
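To make the two central ideas of the abstract concrete, here is a minimal one-dimensional sketch in Python. It is not the construction used in the paper, which the abstract does not spell out. As a stand-in, the upper bound on the log-likelihood is enforced by fitting an ordinary Gaussian process to the warped quantity z = log(bound - log-likelihood), and the surrogate uncertainty is propagated by averaging the likelihood over the GP predictive distribution with Gauss-Hermite quadrature. The toy `log_likelihood`, the kernel hyperparameters, the flat prior, the fixed training design (no active learning), and all function names are assumptions chosen for illustration.

```python
import numpy as np

UPPER_BOUND = 0.0  # assumed known upper bound of the toy log-likelihood
FLOOR = 1e-6       # keeps the log-warp finite if a training value hits the bound


def log_likelihood(theta):
    """Hypothetical stand-in for an expensive model: a Gaussian log-likelihood."""
    return -0.5 * ((theta - 1.0) / 0.5) ** 2


def rbf_kernel(a, b, length_scale, variance):
    """Squared-exponential kernel between two sets of 1D points."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_test, length_scale=1.0, variance=4.0, jitter=1e-8):
    """Posterior mean and variance of a constant-mean GP at x_test."""
    prior_mean = y_train.mean()
    k_train = rbf_kernel(x_train, x_train, length_scale, variance)
    k_train += jitter * np.eye(len(x_train))
    k_cross = rbf_kernel(x_train, x_test, length_scale, variance)
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - prior_mean))
    v = np.linalg.solve(chol, k_cross)
    mean = prior_mean + k_cross.T @ alpha
    var = np.maximum(variance - np.sum(v**2, axis=0), 0.0)
    return mean, var


def expected_likelihood(mean_z, var_z, upper_bound, n_nodes=20):
    """E[exp(upper_bound - exp(Z))] for Z ~ N(mean_z, var_z), via Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    z = mean_z[:, None] + np.sqrt(2.0 * var_z)[:, None] * nodes[None, :]
    return np.exp(upper_bound - np.exp(z)) @ weights / np.sqrt(np.pi)


# A few "expensive" evaluations, warped so that any GP prediction respects the bound.
theta_train = np.linspace(-2.0, 4.0, 6)
z_train = np.log(np.maximum(UPPER_BOUND - log_likelihood(theta_train), FLOOR))

grid = np.linspace(-2.0, 4.0, 301)
dx = grid[1] - grid[0]
mean_z, var_z = gp_predict(theta_train, z_train, grid)

# Plug-in surrogate: uses only the GP mean and ignores its uncertainty.
lik_plugin = np.exp(UPPER_BOUND - np.exp(mean_z))
# Marginal surrogate: averages the likelihood over the GP predictive distribution.
lik_marginal = expected_likelihood(mean_z, var_z, UPPER_BOUND)

# Normalize on the grid (flat prior assumed) to obtain approximate posterior densities.
post_plugin = lik_plugin / (lik_plugin.sum() * dx)
post_marginal = lik_marginal / (lik_marginal.sum() * dx)
```

Because the surrogate log-likelihood is `UPPER_BOUND - np.exp(z)`, it can never exceed the bound, whatever the GP predicts for z. Where the GP variance is zero, `lik_marginal` coincides with `lik_plugin`, so the two posterior approximations differ only where training data are sparse.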