Neural network-based methods have emerged as powerful tools for solving partial differential equations (PDEs) in scientific and engineering applications, particularly when handling complex domains or incorporating empirical data. These methods use neural networks as basis functions to approximate PDE solutions. However, training such networks can be challenging and often yields limited accuracy. In this paper, we investigate the training dynamics of neural network-based PDE solvers, focusing on the impact of initialization techniques. We assess training difficulty by analyzing the eigenvalue distribution of the associated kernel and quantify it with the effective rank, where a larger effective rank correlates with faster convergence of the training error. Building on this, we show through theoretical analysis and numerical experiments that two initialization techniques, partition of unity (PoU) and variance scaling (VS), increase the effective rank and thereby accelerate the convergence of the training error. Furthermore, comprehensive experiments with popular PDE-solving frameworks, including PINN, Deep Ritz, and the operator learning framework DeepONet, confirm that these initialization techniques consistently speed up convergence, in line with our theoretical findings.
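To make the notion of effective rank concrete, the following is a minimal sketch of how it can be computed from a kernel (Gram) matrix. It assumes the standard entropy-based definition of effective rank (exponential of the Shannon entropy of the normalized eigenvalue spectrum, in the sense of Roy and Vetterli); the function name `effective_rank` and the toy spectra are illustrative only and not taken from the paper.

```python
import numpy as np

def effective_rank(K, eps=1e-12):
    """Effective rank of a symmetric PSD kernel/Gram matrix K.

    Assumed definition: erank(K) = exp(-sum_i p_i log p_i), where
    p_i are the eigenvalues of K normalized to sum to one.
    """
    eigvals = np.linalg.eigvalsh(K)           # eigenvalues of the kernel
    eigvals = np.clip(eigvals, 0.0, None)     # guard against tiny negative values
    p = eigvals / (eigvals.sum() + eps)       # normalized spectrum
    entropy = -np.sum(p * np.log(p + eps))    # Shannon entropy of the spectrum
    return float(np.exp(entropy))

# Illustration: a flat spectrum has a large effective rank,
# while a fast-decaying spectrum has a small one.
flat = np.eye(100)                            # all eigenvalues equal
decaying = np.diag(2.0 ** -np.arange(100))    # geometrically decaying eigenvalues
print(effective_rank(flat))      # close to 100
print(effective_rank(decaying))  # much smaller
```

In this picture, a kernel with a flatter eigenvalue spectrum (larger effective rank) distributes the training signal more evenly across modes, which is consistent with the faster error convergence the abstract attributes to the PoU and VS initializations.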