Numerically solving high-dimensional partial differential equations (PDEs) is a major challenge. Conventional methods, such as finite difference methods, are unable to solve high-dimensional PDEs due to the curse of dimensionality. A variety of deep learning methods have recently been developed to solve high-dimensional PDEs by approximating the solution with a neural network. In this paper, we prove global convergence for one of the commonly used deep learning algorithms for solving PDEs, the Deep Galerkin Method (DGM). DGM trains a neural network approximator to solve the PDE using stochastic gradient descent. We prove that, as the number of hidden units in the single-layer network goes to infinity (i.e., in the ``wide network limit''), the trained neural network converges to the solution of an infinite-dimensional linear ordinary differential equation (ODE). The PDE residual of the limiting approximator converges to zero as the training time tends to infinity. Under mild assumptions, this convergence also implies that the neural network approximator converges to the solution of the PDE. A closely related class of deep learning methods for PDEs is Physics-Informed Neural Networks (PINNs). Using the same mathematical techniques, we can prove a similar global convergence result for the PINN neural network approximators. Both proofs require analyzing a kernel function in the limit ODE that governs the evolution of the limit neural network approximator. A key technical challenge is that this kernel function, which is a composition of the PDE operator and the neural tangent kernel (NTK) operator, lacks a spectral gap and therefore requires a careful analysis of its properties.
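To make the training procedure described above concrete, the following is a minimal sketch of a DGM-style objective on a one-dimensional toy problem, $-u''(x) = f(x)$ on $(0,1)$ with $u(0) = u(1) = 0$. It is not the setting analyzed in the paper, only an illustration under stated assumptions: the toy problem, the $\tanh$ activation, the $N^{-\beta}$ output scaling with a hypothetical parameter `beta`, the initialization, the boundary penalty weight `bc_weight`, and all function names are choices made here. The single-hidden-layer network is trained by stochastic gradient descent on the squared PDE residual at freshly sampled points plus a squared boundary penalty, with gradients written out by hand.

```python
import math
import random


def f(x):
    # Source term chosen so that u(x) = sin(pi x) solves -u'' = f, u(0) = u(1) = 0.
    return math.pi ** 2 * math.sin(math.pi * x)


def init_params(n_hidden, rng):
    # One [c, w, b] triple per hidden unit (this initialization is an assumption).
    return [[rng.uniform(-1.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
            for _ in range(n_hidden)]


def u_and_grad(params, x, scale):
    """Network value u(x) = scale * sum_i c_i tanh(w_i x + b_i) and its gradient."""
    u, grad = 0.0, []
    for c, w, b in params:
        t = math.tanh(w * x + b)
        s1 = 1.0 - t * t  # tanh'
        u += scale * c * t
        grad.append((scale * t, scale * c * s1 * x, scale * c * s1))
    return u, grad


def residual_and_grad(params, x, scale):
    """PDE residual r(x) = -u''(x) - f(x) and its gradient w.r.t. each (c, w, b)."""
    r, grad = -f(x), []
    for c, w, b in params:
        t = math.tanh(w * x + b)
        s1 = 1.0 - t * t                      # tanh'
        s2 = -2.0 * t * s1                    # tanh''
        s3 = -2.0 * s1 * (1.0 - 3.0 * t * t)  # tanh'''
        r -= scale * c * w * w * s2
        grad.append((-scale * w * w * s2,
                     -scale * c * (2.0 * w * s2 + w * w * x * s3),
                     -scale * c * w * w * s3))
    return r, grad


def loss_and_grad(params, xs, scale, bc_weight=1.0):
    """Mean squared residual over xs plus a squared penalty on u(0) and u(1)."""
    terms = [residual_and_grad(params, x, scale) + (1.0 / len(xs),) for x in xs]
    terms += [u_and_grad(params, xb, scale) + (bc_weight,) for xb in (0.0, 1.0)]
    loss = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in params]
    for value, g, weight in terms:
        loss += weight * value * value
        for i in range(len(params)):
            for k in range(3):
                grad[i][k] += 2.0 * weight * value * g[i][k]
    return loss, grad


def train(n_hidden=16, steps=200, batch=8, lr=1e-3, beta=0.5, seed=0):
    rng = random.Random(seed)
    scale = n_hidden ** (-beta)  # assumed output normalization
    params = init_params(n_hidden, rng)
    for _ in range(steps):
        xs = [rng.random() for _ in range(batch)]  # fresh uniform samples in [0, 1)
        _, grad = loss_and_grad(params, xs, scale)
        for p, g in zip(params, grad):
            for k in range(3):
                p[k] -= lr * g[k]  # plain SGD update
    return params, scale


if __name__ == "__main__":
    grid = [i / 50 for i in range(51)]
    p0, scale = train(steps=0)   # same seed, so this is the initial network
    p1, _ = train(steps=200)
    print("grid loss before training:", loss_and_grad(p0, grid, scale)[0])
    print("grid loss after training: ", loss_and_grad(p1, grid, scale)[0])
```

Increasing `n_hidden` in this sketch is the finite-width analogue of moving toward the wide network limit. The sketch makes no claim about convergence rates, and the learning rate and step count would need tuning for any serious experiment.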