The Neural Tangent Kernel (NTK) viewpoint is widely employed to analyze the training dynamics of overparameterized Physics-Informed Neural Networks (PINNs). However, we show that, unlike in the case of linear Partial Differential Equations (PDEs), the NTK perspective falls short in the nonlinear scenario. Specifically, we establish that the NTK yields a random matrix at initialization that is not constant during training, contrary to conventional belief. Another significant difference from the linear regime is that, even in the idealized infinite-width limit, the Hessian does not vanish and therefore cannot be disregarded during training. This motivates the adoption of second-order optimization methods. We explore the convergence guarantees of such methods in both the linear and nonlinear cases, addressing challenges such as spectral bias and slow convergence. Every theoretical result is supported by numerical examples with both linear and nonlinear PDEs, and we highlight the benefits of second-order methods in benchmark test cases.
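To make the claim about the NTK concrete, the following is a minimal sketch (not the paper's code) of the quantity under discussion: the empirical NTK of a PINN residual, computed for a toy nonlinear PDE at initialization and again after a few gradient steps. The PDE, network size, step count, and all helper names here are illustrative assumptions made for this sketch; under the abstract's argument the kernel is random at initialization and drifts during training in the nonlinear case.

```python
# Minimal sketch (assumptions: toy problem, hypothetical helper names, standard init).
# Toy nonlinear PDE: -u''(x) + u(x)^3 = f(x) on [0,1] with u(0) = u(1) = 0,
# manufactured so that u*(x) = sin(pi x) is the exact solution.
# Empirical NTK of the residual: K = J J^T with J_ij = d r(x_i) / d theta_j.
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def init_mlp(key, widths=(1, 32, 32, 1)):
    params = []
    for i in range(len(widths) - 1):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (widths[i], widths[i + 1])) / jnp.sqrt(widths[i])
        params.append((W, jnp.zeros(widths[i + 1])))
    return params

def mlp(params, x):  # scalar input -> scalar output
    h = jnp.reshape(x, (1,))
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b)[0]

def f_rhs(x):  # right-hand side consistent with u*(x) = sin(pi x)
    return jnp.pi**2 * jnp.sin(jnp.pi * x) + jnp.sin(jnp.pi * x) ** 3

def residual(params, x):  # nonlinear PDE residual at one collocation point
    u = lambda z: mlp(params, z)
    u_xx = jax.grad(jax.grad(u))(x)
    return -u_xx + u(x) ** 3 - f_rhs(x)

def loss(params, xs):  # mean-squared residual plus boundary penalty
    r = jax.vmap(lambda x: residual(params, x))(xs)
    bc = mlp(params, 0.0) ** 2 + mlp(params, 1.0) ** 2
    return jnp.mean(r**2) + bc

def empirical_ntk(params, xs):
    flat, unravel = ravel_pytree(params)
    res_vec = lambda p: jax.vmap(lambda x: residual(unravel(p), x))(xs)
    J = jax.jacrev(res_vec)(flat)  # Jacobian of residuals w.r.t. flat parameters
    return J @ J.T

key = jax.random.PRNGKey(0)
params = init_mlp(key)
xs = jnp.linspace(0.05, 0.95, 20)

K0 = empirical_ntk(params, xs)          # kernel at (random) initialization
grad_loss = jax.jit(jax.grad(loss))
for _ in range(200):                    # a few plain gradient-descent steps
    g = grad_loss(params, xs)
    params = jax.tree_util.tree_map(lambda p, dp: p - 1e-3 * dp, params, g)
K1 = empirical_ntk(params, xs)          # kernel after training

drift = jnp.linalg.norm(K1 - K0) / jnp.linalg.norm(K0)
print(f"relative NTK drift after training: {drift:.3f}")  # nonzero: the kernel moves
```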