Training neural networks sequentially in time to approximate solution fields of time-dependent partial differential equations can help preserve causality and other physical properties; however, sequential-in-time training is numerically challenging because training errors quickly accumulate and amplify over time. This work introduces Neural Galerkin schemes that update randomized sparse subsets of network parameters at each time step. The randomization, motivated by dropout, which addresses a similar overfitting issue caused by neuron co-adaptation, avoids overfitting locally in time and so helps prevent the error from accumulating quickly during sequential-in-time training. The sparsity of the updates reduces the computational cost of training without losing expressiveness because many of the network parameters are redundant locally at each time step. In numerical experiments with a wide range of evolution equations, the proposed schemes with randomized sparse updates are up to two orders of magnitude more accurate at a fixed computational budget and up to two orders of magnitude faster at a fixed accuracy than schemes with dense updates.
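To make the idea of a randomized sparse update concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy Gaussian-sum ansatz in place of a neural network, the linear advection equation ∂_t u + a ∂_x u = 0 as the evolution equation, fixed collocation points, a least-squares solve for the time derivative of the selected parameters, and a forward Euler step. The names `model`, `jacobian_and_rhs`, and `sparse_step` are hypothetical and chosen for illustration only.

```python
import numpy as np

def model(theta, x):
    """Toy ansatz u(x; theta) = sum_i c_i * exp(-w_i^2 (x - b_i)^2), theta = [c, w, b]."""
    c, w, b = np.split(theta, 3)
    d = x[:, None] - b[None, :]            # (n_points, m) offsets to unit centers
    return np.exp(-(w * d) ** 2) @ c

def jacobian_and_rhs(theta, x, speed):
    """Parameter Jacobian of u and right-hand side f = -speed * du/dx (advection)."""
    c, w, b = np.split(theta, 3)
    d = x[:, None] - b[None, :]
    g = np.exp(-(w * d) ** 2)
    du_dc = g
    du_dw = -2.0 * c * w * d**2 * g
    du_db = 2.0 * c * w**2 * d * g
    J = np.hstack([du_dc, du_dw, du_db])   # (n_points, 3m), same ordering as theta
    du_dx = (-2.0 * c * w**2 * d * g).sum(axis=1)
    return J, -speed * du_dx

def sparse_step(theta, x, dt, s, rng, speed=1.0):
    """One time step that updates only a random subset of s parameters."""
    J, f = jacobian_and_rhs(theta, x, speed)
    # Draw a fresh random index set at every step (the randomization).
    idx = rng.choice(theta.size, size=s, replace=False)
    # Fit the right-hand side using only the selected Jacobian columns (the sparsity).
    theta_dot_s, *_ = np.linalg.lstsq(J[:, idx], f, rcond=None)
    new_theta = theta.copy()
    new_theta[idx] += dt * theta_dot_s     # forward Euler (an assumption of this sketch)
    return new_theta, idx

# Usage: 8 Gaussian units (24 parameters), 6 of which are updated in this step.
rng = np.random.default_rng(0)
m = 8
theta0 = np.concatenate([rng.normal(size=m), np.ones(m), np.linspace(-2.0, 2.0, m)])
x = np.linspace(-4.0, 4.0, 200)
theta1, idx = sparse_step(theta0, x, dt=1e-2, s=6, rng=rng)
```

In this sketch the least-squares problem has `s` unknowns instead of `3 * m`, which is where the cost reduction of a sparse update would come from; a dense update corresponds to `s = theta.size`. Whether the sparse update is accurate depends on the redundancy of the parametrization, which this toy example does not attempt to demonstrate.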