We introduce a novel two-level overlapping additive Schwarz preconditioner for accelerating the training of scientific machine learning applications. Its design is motivated by the nonlinear two-level overlapping additive Schwarz preconditioner. The neural network parameters are decomposed into groups (subdomains) with overlapping regions. In addition, the network's feed-forward structure is indirectly imposed through a novel subdomain-wise synchronization strategy and a coarse-level training step. Through a series of numerical experiments covering physics-informed neural networks and operator learning approaches, we demonstrate that the proposed two-level preconditioner significantly speeds up the convergence of the standard L-BFGS optimizer while also yielding more accurate machine learning models. Moreover, the preconditioner is designed to take advantage of model-parallel computations, which can further reduce the training time.
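To make the idea of overlapping parameter groups concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the decomposition is done layer-wise and that each subdomain is a contiguous block of layers widened by a fixed number of neighboring layers on each side. The function name `overlapping_layer_groups` and its parameters `num_layers`, `num_subdomains`, and `overlap` are hypothetical and chosen only for illustration.

```python
def overlapping_layer_groups(num_layers, num_subdomains, overlap):
    """Split layer indices 0..num_layers-1 into overlapping groups.

    Assumption (not taken from the abstract): subdomains are contiguous
    blocks of layers, each extended by `overlap` layers on both sides.
    """
    if not 1 <= num_subdomains <= num_layers:
        raise ValueError("need 1 <= num_subdomains <= num_layers")
    if overlap < 0:
        raise ValueError("overlap must be non-negative")

    # Sizes of the non-overlapping blocks, as equal as possible.
    base, extra = divmod(num_layers, num_subdomains)

    groups = []
    start = 0
    for s in range(num_subdomains):
        size = base + (1 if s < extra else 0)
        stop = start + size
        # Widen the block by `overlap` layers, clipped to the network.
        lo = max(0, start - overlap)
        hi = min(num_layers, stop + overlap)
        groups.append(list(range(lo, hi)))
        start = stop
    return groups


if __name__ == "__main__":
    # Six layers, three subdomains, one layer of overlap on each side.
    for i, group in enumerate(overlapping_layer_groups(6, 3, 1)):
        print(f"subdomain {i}: layers {group}")
```

In this sketch, layers that appear in more than one group are the overlapping regions. How the updates on those shared layers are combined, and how the coarse-level step is formed, is specific to the proposed method and is not modeled here.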