Training nonlinear parametrizations such as deep neural networks to numerically approximate solutions of partial differential equations is often based on minimizing a loss that includes the residual, which is available analytically only in limited settings. At the same time, estimating the training loss empirically is challenging because residuals and related quantities can have high variance, especially for transport-dominated and high-dimensional problems that exhibit local features such as waves and coherent structures. Estimators based on data samples from uninformed, uniform distributions are therefore inefficient. This work introduces Neural Galerkin schemes that estimate the training loss with data from adaptive distributions, which are represented empirically by ensembles of particles. The ensembles are actively adapted by evolving the particles with dynamics coupled to the nonlinear parametrizations of the solution fields, so that the ensembles remain informative for estimating the training loss. Numerical experiments indicate that a few dynamic particles are sufficient for obtaining accurate empirical estimates of the training loss, even for problems with local features and with high-dimensional spatial domains.
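The inefficiency of uniform samples for localized residuals can be illustrated with a minimal sketch. It is not the Neural Galerkin scheme itself: there are no particle dynamics and no neural network. It compares a uniform Monte Carlo estimator with a static importance-sampling estimator whose samples are concentrated near a local feature. The toy squared residual, the constants `MU` and `S`, and all function names are assumptions made for illustration only.

```python
import math
import random
import statistics

# Hypothetical location and width of a narrow local feature on [0, 1].
MU, S = 0.5, 0.01

# Integral of the toy squared residual over the real line; the part that
# falls outside [0, 1] is negligible for these constants.
EXACT = S * math.sqrt(math.pi)


def residual_sq(x):
    """Toy squared residual: a narrow bump on [0, 1], zero elsewhere."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return math.exp(-((x - MU) ** 2) / S**2)


def uniform_estimate(n, rng):
    """Monte Carlo estimate of the integral with uniform samples on [0, 1]."""
    return sum(residual_sq(rng.random()) for _ in range(n)) / n


def adapted_estimate(n, rng, width=S):
    """Importance-sampling estimate with Gaussian samples centered on the bump."""
    total = 0.0
    norm = width * math.sqrt(2.0 * math.pi)
    for _ in range(n):
        x = rng.gauss(MU, width)
        density = math.exp(-((x - MU) ** 2) / (2.0 * width**2)) / norm
        total += residual_sq(x) / density  # reweight by the sampling density
    return total / n


def spread(estimator, n, trials, seed):
    """Standard deviation of an estimator over repeated independent trials."""
    rng = random.Random(seed)
    return statistics.pstdev([estimator(n, rng) for _ in range(trials)])


if __name__ == "__main__":
    n, trials = 50, 200
    print("exact integral   :", EXACT)
    print("uniform spread   :", spread(uniform_estimate, n, trials, seed=0))
    print("adapted spread   :", spread(adapted_estimate, n, trials, seed=0))
```

In this sketch the sampling distribution is fixed by hand because the location of the bump is known. The schemes described in the abstract instead move the particles over time, coupled to the parametrized solution, so that the samples keep tracking the features as they evolve.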