Solving High-Dimensional Inverse Problems with Auxiliary Uncertainty via Operator Learning with Limited Data

In complex large-scale systems such as the climate, important effects arise from a combination of confounding processes that are not fully observable. Identifying sources from observations of the system state is vital for attribution and prediction, which inform critical policy decisions. These inverse problems are difficult because sources cannot be isolated and computational models are costly to simulate. Surrogate models may enable the many-query algorithms required for source identification, but data challenges arise from the high dimensionality of the state and source, the limited ensembles of costly model simulations available to train a surrogate, and the few, potentially noisy state observations available for inversion due to measurement limitations. The influence of auxiliary processes adds a further layer of uncertainty that confounds source identification. We introduce a framework based on (1) calibrating deep neural network surrogates to the flow maps provided by an ensemble of simulations obtained by varying sources, and (2) using these surrogates in a Bayesian framework to identify sources from observations via optimization. Focusing on an atmospheric dispersion exemplar, we find that the expressive and computationally efficient nature of the deep neural network operator surrogates, in appropriately reduced dimension, allows for source identification with uncertainty quantification using limited data. Introducing a variable wind field as an auxiliary process, we find that a Bayesian approximation error approach is essential for reliable source inversion when uncertainty due to wind stresses the algorithm.
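The role of the Bayesian approximation error approach can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the framework described above: a toy linear Gaussian operator (`forward`) stands in for the deep neural network surrogate, a scalar `wind` shift stands in for the variable wind field, and all names, dimensions, and noise levels are hypothetical. The sketch samples the discrepancy between the wind-aware model and a nominal-wind surrogate, then folds its mean and covariance into the noise model before computing a Gaussian MAP estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_obs = 5, 20  # hypothetical source and observation dimensions


def forward(wind):
    """Toy linear 'dispersion' operator whose kernel shifts with the wind (illustrative only)."""
    x_obs = np.linspace(0.0, 1.0, n_obs)[:, None]
    x_src = np.linspace(0.1, 0.9, n_src)[None, :]
    return np.exp(-((x_obs - x_src - wind) ** 2) / 0.02)


A0 = forward(0.0)  # stand-in surrogate, built at a nominal wind
prior_mean = np.zeros(n_src)
prior_cov = np.eye(n_src)
noise_cov = 1e-4 * np.eye(n_obs)

# Approximation error statistics: draw (source, wind) pairs from their priors and
# record the discrepancy between the wind-aware model and the nominal surrogate.
samples = []
for _ in range(500):
    m = rng.multivariate_normal(prior_mean, prior_cov)
    w = rng.normal(0.0, 0.05)
    samples.append(forward(w) @ m - A0 @ m)
eps = np.array(samples)
eps_mean = eps.mean(axis=0)
eps_cov = np.cov(eps, rowvar=False)


def map_estimate(y, total_cov, shift):
    """Gaussian MAP estimate and posterior covariance for y = A0 m + shift + e, e ~ N(0, total_cov)."""
    W = np.linalg.inv(total_cov)
    P = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(A0.T @ W @ A0 + P)
    post_mean = post_cov @ (A0.T @ W @ (y - shift) + P @ prior_mean)
    return post_mean, post_cov


# Synthetic observation generated with a wind that the surrogate does not know about.
m_true = rng.multivariate_normal(prior_mean, prior_cov)
y = forward(0.05) @ m_true + rng.multivariate_normal(np.zeros(n_obs), noise_cov)

# Inversion that ignores the wind uncertainty versus one that accounts for it.
m_naive, cov_naive = map_estimate(y, noise_cov, np.zeros(n_obs))
m_bae, cov_bae = map_estimate(y, noise_cov + eps_cov, eps_mean)

print("posterior variance (ignoring wind):   ", np.trace(cov_naive))
print("posterior variance (approx. error):   ", np.trace(cov_bae))
```

In this linear Gaussian setting the optimization has a closed form, which keeps the sketch short; with a nonlinear surrogate the same objective would instead be minimized iteratively. Inflating the noise covariance by the sampled discrepancy covariance can only widen the posterior, which is the sense in which the approach guards against overconfident source estimates when the auxiliary process is uncertain.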
