Variational Physics-Informed Neural Networks often suffer from poor convergence when trained with stochastic gradient-descent-based optimizers. By introducing a least-squares solver for the weights of the last layer of the neural network, we improve the convergence of the training loss in most practical scenarios. This work analyzes the computational cost of the resulting hybrid least-squares/gradient-descent optimizer and explains how to implement it efficiently. In particular, we show that a traditional implementation based on backward-mode automatic differentiation leads to a prohibitively expensive algorithm. To remedy this, we propose using either forward-mode automatic differentiation or an ultraweak-type scheme that avoids differentiating the trial functions in the discrete weak formulation. The proposed alternatives are up to 100 times faster than the traditional implementation, recovering a computational cost per iteration similar to that of a conventional gradient-descent-based optimizer alone. To support our analysis, we derive computational estimates and conduct numerical experiments on one- and two-dimensional problems.
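The key observation behind the hybrid optimizer is that a network's output is linear in its last-layer weights, so for a frozen hidden stack those weights admit an exact least-squares solution instead of a gradient-descent update. A minimal numpy sketch of this idea follows; all names are hypothetical, and random `tanh` features stand in for a trained hidden layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Collocation points and a target to fit (a stand-in for the linear
# system arising from a discrete variational/weak-form loss).
x = np.linspace(0.0, 1.0, 50)[:, None]
target = np.sin(np.pi * x).ravel()

# Frozen hidden layer: u(x) = sum_j w_j * phi_j(x), so the output is
# linear in the last-layer weights w.
W_hidden = rng.normal(size=(1, 20))
b_hidden = rng.normal(size=20)
phi = np.tanh(x @ W_hidden + b_hidden)  # (50, 20) feature matrix

# Randomly initialized last-layer weights, as plain gradient descent
# would start from.
w_init = 0.1 * rng.normal(size=20)
loss_before = np.mean((phi @ w_init - target) ** 2)

# The hybrid step: solve for the optimal last-layer weights directly.
w_ls, *_ = np.linalg.lstsq(phi, target, rcond=None)
loss_after = np.mean((phi @ w_ls - target) ** 2)

assert loss_after < loss_before
```

In the full method, gradient descent still updates the hidden-layer parameters between least-squares solves; the cost analysis in the paper concerns how the entries of the least-squares system are assembled via automatic differentiation.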