Operator learning focuses on approximating mappings $\mathcal{G}^\dagger:\mathcal{U} \rightarrow\mathcal{V}$ between infinite-dimensional spaces of functions, such as $u: \Omega_u\rightarrow\mathbb{R}$ and $v: \Omega_v\rightarrow\mathbb{R}$, which makes it particularly well suited to solving parametric nonlinear partial differential equations (PDEs). While most machine learning methods for operator learning rely on variants of deep neural networks (NNs), recent studies have shown that Gaussian processes (GPs) are also competitive while offering interpretability and theoretical guarantees. In this paper, we introduce a hybrid GP/NN-based framework for operator learning that leverages the strengths of both methods. Instead of approximating the function-valued operator $\mathcal{G}^\dagger$ directly, we use a GP to approximate its associated real-valued bilinear form $\widetilde{\mathcal{G}}^\dagger: \mathcal{U}\times\mathcal{V}^*\rightarrow\mathbb{R}$, defined by $\widetilde{\mathcal{G}}^\dagger(u,\varphi) := [\varphi,\mathcal{G}^\dagger(u)]$, which allows us to recover the operator through point evaluations: $\mathcal{G}^\dagger(u)(y)=\widetilde{\mathcal{G}}^\dagger(u,\delta_y)$. The GP mean function can be either zero or parameterized by a neural operator, and for each setting we develop a robust training mechanism based on maximum likelihood estimation (MLE) that can optionally leverage the physics of the underlying PDE. Numerical benchmarks show that (1) our framework improves the performance of a base neural operator by using it as the mean function of a GP, and (2) it enables zero-shot data-driven models that make accurate predictions without prior training. The framework also handles multi-output operators of the form $\mathcal{G}^\dagger:\mathcal{U} \rightarrow\prod_{s=1}^S\mathcal{V}^s$, and benefits from computational speed-ups via product kernel structures and Kronecker product matrix representations.
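The bilinear-form viewpoint above can be illustrated with a minimal toy sketch: a zero-mean GP regression whose inputs are a discretization of $u$ on a sensor grid concatenated with a query point $y$, and whose targets are $\widetilde{\mathcal{G}}^\dagger(u,\delta_y)=\mathcal{G}^\dagger(u)(y)$. Everything concrete here — the derivative operator as the learning target, the parametric family of input functions, the RBF kernel and its length-scale — is an illustrative assumption for the sketch, not the paper's actual implementation.

```python
import numpy as np

# Toy illustration (assumed setup, not the paper's method): learn the
# bilinear form G~(u, delta_y) = G(u)(y) for the derivative operator
# G(u) = u', via plain zero-mean GP regression on z = [u(x_1..m), y].

rng = np.random.default_rng(0)
m = 20                                 # grid resolution for discretizing u
xg = np.linspace(0.0, 1.0, m)          # sensor grid for input functions

def sample_u(a, b):
    """A small parametric family of input functions u(x) = a*sin(b*x)."""
    return a * np.sin(b * xg)

# Training pairs: features (discretized u, query point y),
# targets G(u)(y) = u'(y) = a*b*cos(b*y).
Z, t = [], []
for _ in range(200):
    a, b = rng.uniform(0.5, 2.0), rng.uniform(1.0, 3.0)
    y = rng.uniform(0.0, 1.0)
    Z.append(np.concatenate([sample_u(a, b), [y]]))
    t.append(a * b * np.cos(b * y))
Z, t = np.array(Z), np.array(t)

def rbf(A, B, ell=3.0):
    """Squared-exponential kernel between row-stacked inputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

# Zero-mean GP posterior mean (the "zero mean" setting in the text);
# a neural-operator mean would simply be added to this prediction.
K = rbf(Z, Z) + 1e-6 * np.eye(len(Z))
alpha = np.linalg.solve(K, t)

def predict(u_vals, y):
    """Approximate G(u)(y) = G~(u, delta_y) at a new (u, y) pair."""
    z = np.concatenate([u_vals, [y]])[None, :]
    return (rbf(z, Z) @ alpha).item()

# Compare the GP prediction with the exact derivative a*b*cos(b*y).
a, b, y = 1.3, 2.0, 0.4
print(predict(sample_u(a, b), y), a * b * np.cos(b * y))
```

Because the same GP conditions on point evaluations at arbitrary query locations $y$, the sketch also hints at why the framework is discretization-flexible in the output: $y$ is just another coordinate of the kernel input rather than a fixed output grid.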