Efficient geometric Markov chain Monte Carlo for nonlinear Bayesian inversion enabled by derivative-informed neural operators

We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional nonlinear Bayesian inverse problems. While geometric MCMC employs high-quality proposals that adapt to the local geometry of the posterior, it requires computing local gradient and Hessian information of the log-likelihood, incurring a high cost when the parameter-to-observable (PtO) map is defined through expensive model simulations. We consider a delayed-acceptance geometric MCMC method driven by a neural operator surrogate of the PtO map, where the proposal is designed to exploit fast surrogate approximations of the log-likelihood and, simultaneously, its gradient and Hessian. To achieve a substantial speedup, the surrogate needs to be accurate in predicting both the observable and its parametric derivative (the derivative of the observable with respect to the parameter). Training such a surrogate via conventional operator learning using input--output samples often demands a prohibitively large number of model simulations. In this work, we present an extension of derivative-informed operator learning [O'Leary-Roseberry et al., J. Comput. Phys., 496 (2024)] using input--output--derivative training samples. Such a learning method leads to derivative-informed neural operator (DINO) surrogates that accurately predict the observable and its parametric derivative at a significantly lower training cost than the conventional method. A cost and error analysis for reduced-basis DINO surrogates is provided. Numerical studies on PDE-constrained Bayesian inversion demonstrate that DINO-driven MCMC generates effective posterior samples 3--9 times faster than geometric MCMC and 60--97 times faster than prior geometry-based MCMC. Furthermore, the training cost of DINO surrogates breaks even after collecting merely 10--25 effective posterior samples compared to geometric MCMC.
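The delayed-acceptance idea mentioned in the abstract can be illustrated with a minimal two-stage Metropolis-Hastings sketch. This is not the method of the paper: it is a one-dimensional toy in which a plain Gaussian random-walk proposal stands in for the geometric, surrogate-informed proposal, and the functions `log_post` (the "expensive" target) and `log_post_surr` (the cheap surrogate) are hypothetical placeholders chosen for illustration. The first stage screens each proposal using the surrogate only; the expensive target is evaluated only for proposals that pass, and the second stage corrects for the surrogate error so that the chain still targets the true posterior.

```python
import numpy as np


def delayed_acceptance_mh(log_post, log_post_surr, x0, n_steps, step, rng):
    """Two-stage (delayed-acceptance) Metropolis-Hastings with a symmetric
    random-walk proposal. Returns the chain and the number of calls made to
    the expensive log-posterior inside the loop."""
    x = float(x0)
    lp_x = log_post(x)        # expensive log-posterior at the current state
    ls_x = log_post_surr(x)   # surrogate log-posterior at the current state
    samples = np.empty(n_steps)
    n_true_evals = 0
    for i in range(n_steps):
        # Symmetric proposal (assumption: stands in for a geometric proposal).
        y = x + step * rng.standard_normal()
        ls_y = log_post_surr(y)
        # Stage 1: accept/reject using the surrogate only.
        if np.log(rng.uniform()) < ls_y - ls_x:
            lp_y = log_post(y)
            n_true_evals += 1
            # Stage 2: correct for the surrogate error with the true target.
            if np.log(rng.uniform()) < (lp_y - lp_x) - (ls_y - ls_x):
                x, lp_x, ls_x = y, lp_y, ls_y
        samples[i] = x
    return samples, n_true_evals


# Hypothetical toy problem: standard normal target, slightly too wide surrogate.
def log_post(x):
    return -0.5 * x**2


def log_post_surr(x):
    return -0.5 * (x / 1.2) ** 2


rng = np.random.default_rng(0)
n_steps = 20000
samples, n_true_evals = delayed_acceptance_mh(
    log_post, log_post_surr, x0=0.0, n_steps=n_steps, step=1.0, rng=rng
)
print("sample mean:", samples.mean())
print("sample std:", samples.std())
print("expensive evaluations:", n_true_evals, "of", n_steps, "steps")
```

In this sketch the saving comes from proposals rejected at the first stage, which never trigger an expensive evaluation. A more accurate surrogate makes the second-stage ratio closer to one, so fewer expensive evaluations are wasted on proposals that end up rejected.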
