Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective

A burgeoning line of research leverages deep neural networks to approximate solutions of high-dimensional PDEs, opening lines of theoretical inquiry focused on explaining how these models appear to evade the curse of dimensionality. However, most prior theoretical analyses have been limited to linear PDEs. In this work, we take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs. We focus on a class of PDEs known as \emph{nonlinear elliptic variational PDEs}, whose solutions minimize an \emph{Euler-Lagrange} energy functional $\mathcal{E}(u) = \int_\Omega \left( L(x, u(x), \nabla u(x)) - f(x) u(x) \right) dx$. We show that if composing a function with Barron norm $b$ with partial derivatives of $L$ produces a function of Barron norm at most $B_L b^p$, then the solution to the PDE can be $\epsilon$-approximated in the $L^2$ sense by a function with Barron norm $O\left(\left(dB_L\right)^{\max\{p \log(1/ \epsilon), p^{\log(1/\epsilon)}\}}\right)$. By a classical result due to Barron [1993], this correspondingly bounds the size of a 2-layer neural network needed to approximate the solution. Treating $p, \epsilon, B_L$ as constants, this quantity is polynomial in the dimension $d$, thus showing that neural networks can evade the curse of dimensionality. Our proof technique involves neurally simulating (preconditioned) gradient descent in an appropriate Hilbert space, which converges exponentially fast to the solution of the PDE, and for which we can bound the growth of the Barron norm at each iteration. Our results subsume and substantially generalize analogous prior results for linear elliptic PDEs over a unit hypercube.
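
As a hedged sketch of how these pieces fit together, the simulated iteration and the resulting size bound can be written as follows. The preconditioner $(I - \Delta)^{-1}$, the step size $\eta$, the contraction factor $\rho$, the minimizer $u^\star$, the ball radius $r$, and the network width $n$ are illustrative assumptions that do not appear in the statement above.

\[
\begin{aligned}
D\mathcal{E}(u) &= -\nabla \cdot \partial_{\nabla u} L(x, u, \nabla u) + \partial_u L(x, u, \nabla u) - f, \\
u_{t+1} &= u_t - \eta \, (I - \Delta)^{-1} D\mathcal{E}(u_t), \\
\|u_t - u^\star\|_{L^2} &\lesssim \rho^{t} \ \text{ with } \rho < 1 \quad \Longrightarrow \quad \|u_T - u^\star\|_{L^2} \le \epsilon \ \text{ for } T = O(\log(1/\epsilon)), \\
\|g - g_n\|_{L^2(\mu)}^2 &\le \frac{(2 r C_g)^2}{n} \quad \text{for } g \text{ with Barron norm } C_g \text{ and a 2-layer network } g_n \text{ of width } n.
\end{aligned}
\]

The last line is the bound of Barron [1993] for a probability measure $\mu$ supported on a ball of radius $r$. Under these assumptions, holding $k = \max\{p \log(1/\epsilon), p^{\log(1/\epsilon)}\}$ constant, a Barron norm of $O\left((dB_L)^{k}\right)$ suggests that a width of $n = O\left(r^2 (dB_L)^{2k} / \epsilon^2\right)$ suffices, which is polynomial in $d$.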