Neural closure models have recently been proposed as a method for efficiently approximating small scales in multiscale systems with neural networks. The choice of loss function and associated training procedure has a large effect on the accuracy and stability of the resulting neural closure model. In this work, we systematically compare three distinct procedures: "derivative fitting", "trajectory fitting" with discretise-then-optimise, and "trajectory fitting" with optimise-then-discretise. Derivative fitting is conceptually the simplest and computationally the most efficient approach and is found to perform reasonably well on one of the test problems (Kuramoto-Sivashinsky) but poorly on the other (Burgers). Trajectory fitting is computationally more expensive but is more robust and is therefore the preferred approach. Of the two trajectory fitting procedures, the discretise-then-optimise approach produces more accurate models than the optimise-then-discretise approach. While the optimise-then-discretise approach can still produce accurate models, care must be taken in choosing the length of the trajectories used for training, in order to train the models on long-term behaviour while still producing reasonably accurate gradients during training. Two existing theorems are interpreted in a novel way that gives insight into the long-term accuracy of a neural closure model based on how accurate it is in the short term.
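To make the distinction between derivative fitting and trajectory fitting concrete, the following is a minimal sketch on a toy scalar problem, not the setup used in this work. It assumes a hypothetical reference system du/dt = -u, a hypothetical resolved model `coarse_rhs`, and a one-parameter `closure` standing in for a neural network. The trajectory loss uses a forward Euler rollout, which corresponds in spirit to the discretise-then-optimise variant: the loss is defined on the discretised solution, so in practice gradients would be obtained by differentiating through the solver steps.

```python
import math

def coarse_rhs(u):
    # Resolved part of the dynamics (hypothetical toy model).
    return -0.5 * u

def closure(u, theta):
    # Hypothetical one-parameter closure term; a neural network in practice.
    return theta * u

def derivative_loss(theta, states, ref_derivs):
    # Derivative fitting: match the instantaneous right-hand side
    # of the closed model to reference time derivatives.
    errs = [(coarse_rhs(u) + closure(u, theta) - d) ** 2
            for u, d in zip(states, ref_derivs)]
    return sum(errs) / len(errs)

def rollout(theta, u0, dt, n_steps):
    # Forward Euler rollout of the closed model from u0.
    traj = [u0]
    u = u0
    for _ in range(n_steps):
        u = u + dt * (coarse_rhs(u) + closure(u, theta))
        traj.append(u)
    return traj

def trajectory_loss(theta, ref_traj, dt):
    # Trajectory fitting: match the predicted solution over the
    # whole training trajectory, not just the derivative.
    pred = rollout(theta, ref_traj[0], dt, len(ref_traj) - 1)
    errs = [(p - r) ** 2 for p, r in zip(pred, ref_traj)]
    return sum(errs) / len(errs)

# Reference data: exact solution of du/dt = -u with u(0) = 1.
dt, n_steps = 0.1, 20
ref_traj = [math.exp(-k * dt) for k in range(n_steps + 1)]
ref_derivs = [-u for u in ref_traj]

print(derivative_loss(-0.5, ref_traj, ref_derivs))
print(trajectory_loss(-0.5, ref_traj, dt))
```

The number of rollout steps (`n_steps` here) plays the role of the training trajectory length discussed above: longer rollouts expose the model to longer-term behaviour, at a higher cost per loss evaluation.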