In recent years, machine learning methods have made significant advances as tools for analyzing physical systems. A particularly active direction in this theme is "physics-informed machine learning," which focuses on using neural networks to numerically solve differential equations. In this work, we aim to advance the theory of measuring out-of-sample error while training DeepONets, which are among the most versatile approaches for solving PDE systems in one shot. First, for a class of DeepONets, we prove a bound on their Rademacher complexity that does not explicitly scale with the width of the nets involved. Second, we use this to show how the Huber loss can be chosen so that, for these DeepONet classes, generalization error bounds can be obtained with no explicit dependence on the size of the nets. The effective capacity measure for DeepONets that we thereby derive is also shown to correlate with the behavior of the generalization error in experiments.
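The Huber loss mentioned above is the standard piecewise loss that is quadratic for small residuals and linear beyond a transition parameter δ; the paper's contribution lies in how δ is chosen for the DeepONet classes in question, which is not reproduced here. A minimal sketch of the loss itself, with an illustrative default δ:

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Standard Huber loss: quadratic for |residual| <= delta,
    linear (with matching value and slope) beyond it.
    The delta=1.0 default is illustrative only."""
    r = np.abs(residual)
    return np.where(r <= delta,
                    0.5 * r ** 2,
                    delta * (r - 0.5 * delta))
```

Because the loss grows only linearly in the residual beyond δ, it is Lipschitz on the whole real line, which is the property typically exploited when passing from Rademacher complexity of the predictor class to a generalization bound.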