Neural network-based approaches have recently shown significant promise in solving partial differential equations (PDEs) in science and engineering, especially in scenarios involving complex domains or the incorporation of empirical data. One advantage of neural network methods for PDEs is automatic differentiation (AD), which requires only the sample points themselves, whereas traditional finite difference (FD) approximations need additional nearby points to compute derivatives. In this paper, we quantitatively demonstrate the advantage of AD in training neural networks. We introduce the concept of truncated entropy to characterize the training behavior. Specifically, through comprehensive experimental and theoretical analyses of random feature models and two-layer neural networks, we find that the defined truncated entropy serves as a reliable metric for quantifying the residual loss of random feature models and the training speed of neural networks under both AD and FD. Our results demonstrate that, from a training perspective, AD outperforms FD in solving PDEs.
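To make the AD-versus-FD distinction concrete, the following minimal sketch (illustrative only, not code from the paper) computes the second derivative of a toy two-layer network in both ways; the network `u`, its parameter shapes, and the step size `h` are assumptions chosen for illustration. AD yields the exact derivative at each sample point alone, while FD approximates it from neighboring points.

```python
# Minimal sketch contrasting AD and FD for a PDE residual term u_xx.
# The toy network u, its parameters, and the step size h are illustrative.
import jax
import jax.numpy as jnp

def u(params, x):
    # Toy two-layer network: u(x) = w2 . tanh(w1 * x + b1), scalar output.
    w1, b1, w2 = params
    return jnp.dot(w2, jnp.tanh(w1 * x + b1))

# Automatic differentiation: exact u'(x) and u''(x) at the sample point alone.
u_x = jax.grad(u, argnums=1)
u_xx_ad = jax.grad(u_x, argnums=1)

# Finite differences: the same quantity approximated from nearby points.
def u_xx_fd(params, x, h=1e-3):
    return (u(params, x + h) - 2.0 * u(params, x) + u(params, x - h)) / h**2

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (jax.random.normal(k1, (16,)),
          jax.random.normal(k2, (16,)),
          jax.random.normal(k3, (16,)))
x0 = 0.5
print(u_xx_ad(params, x0), u_xx_fd(params, x0))
```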