Deep Operator Networks (DeepONets) are an increasingly popular paradigm for solving regression in infinite dimensions and hence for solving families of PDEs in one shot. In this work, we aim to establish a first-of-its-kind data-dependent lower bound on the size of DeepONets required for them to be able to reduce empirical error on noisy data. In particular, we show that for low training errors to be obtained on $n$ data points, it is necessary that the common output dimension of the branch and the trunk net scale as $\Omega\left(\sqrt{n}\right)$. This inspires our experiments with DeepONets solving the advection-diffusion-reaction PDE, where we demonstrate the possibility that, at a fixed model size, to leverage an increase in this common output dimension and obtain a monotonic lowering of training error, the size of the training data might necessarily need to scale quadratically with it.
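To make the "common output dimension" concrete, the following is a minimal NumPy sketch of a DeepONet forward pass, in which the branch and trunk nets both map into $\mathbb{R}^q$ and the prediction is their inner product. The layer widths, the `tanh` activation, the helper names (`init_mlp`, `mlp`, `deeponet`, `min_common_dim`) and the constant `c` are illustrative assumptions, not taken from this work; in particular, `min_common_dim` only illustrates the $\sqrt{n}$ scaling and does not reproduce the actual data-dependent bound.

```python
import math
import numpy as np

def init_mlp(rng, sizes):
    """Random weights and zero biases for a fully connected net (hypothetical helper)."""
    return [(rng.standard_normal((m, k)) / math.sqrt(m), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the net: tanh on hidden layers, linear final layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def deeponet(branch, trunk, u_sensors, y):
    """Inner product of branch and trunk outputs over the common dimension q."""
    return np.sum(mlp(branch, u_sensors) * mlp(trunk, y), axis=-1)

def min_common_dim(n, c=1.0):
    """Illustrative q >= c * sqrt(n); c is a placeholder constant (assumption)."""
    return math.ceil(c * math.sqrt(n))

rng = np.random.default_rng(0)
n, m, d = 400, 50, 1               # samples, sensor points, query-location dimension
q = min_common_dim(n)              # common output dimension of branch and trunk
branch = init_mlp(rng, [m, 64, q]) # branch net: sensor values -> R^q
trunk = init_mlp(rng, [d, 64, q])  # trunk net: query location -> R^q
u = rng.standard_normal((n, m))    # input functions sampled at m sensors
y = rng.uniform(size=(n, d))       # one query location per sample
pred = deeponet(branch, trunk, u, y)  # one scalar prediction per sample
```

Read in the other direction, the same scaling suggests that doubling `q` would call for roughly four times as many training points, which is the quadratic relationship the experiments probe.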