Deep neural networks (DNNs) show great promise for solving partial differential equations (PDEs), but their deep architectures lead to complex, large-scale, non-convex optimization problems. Nonlinear PDEs such as the viscous Burgers' equation compound these difficulties because their solutions exhibit steep gradients and shock-like behavior. To address these difficulties, we propose a two-stage multi-grade deep learning (TS-MGDL) method. In the first stage, shallow networks are trained progressively, grade by grade, to fit the target function from its low-frequency to its high-frequency components; previously learned grades are frozen, and each new residual block is trained solely to minimize the remaining approximation error. The second stage unfreezes and retrains selected layers, using the first-stage network as initialization. The result is an interpretable, stable hierarchical refinement that mitigates optimization complexity. Furthermore, we prove that each grade and each stage of TS-MGDL monotonically reduces the loss function under an appropriate optimization strategy. Numerical experiments on 1D, 2D, and 3D viscous Burgers' equations demonstrate that TS-MGDL significantly outperforms single-grade learning (SGL), reducing predictive errors by up to a factor of 60.
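As a rough illustration of the two-stage, grade-by-grade idea described above, the following NumPy sketch fits a 1D function with a steep transition. It is not the authors' implementation and does not solve Burgers' equation: the target function, the network widths, the choice of which grades to unfreeze, and every name (`init_grade`, `train`, and so on) are assumptions made for this example. Gradient descent with backtracking stands in for the "appropriate optimization strategy"; it only accepts steps that do not increase the loss, and each new grade starts with a zero output head, so the recorded loss is non-increasing across grades and stages by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Illustrative target (assumption): steep, shock-like profile plus a high-frequency term.
    return np.tanh(20.0 * x) + 0.1 * np.sin(8.0 * np.pi * x)

def init_grade(in_dim, width):
    # Output head (a, c) starts at zero, so a new grade initially contributes
    # nothing and the loss starts exactly at the previous grade's value.
    return {"W": rng.normal(0.0, 2.0 / np.sqrt(in_dim), (in_dim, width)),
            "b": rng.uniform(-1.0, 1.0, width),
            "a": np.zeros((width, 1)),
            "c": np.zeros(1)}

def forward(grades, x):
    # Each grade maps the previous grade's features to new features and adds
    # its own output to the running prediction.
    feats, pred, h = [], np.zeros((x.shape[0], 1)), x
    for g in grades:
        h_in = h
        h = np.tanh(h_in @ g["W"] + g["b"])
        pred = pred + h @ g["a"] + g["c"]
        feats.append((h_in, h))
    return pred, feats

def loss_and_grads(grades, x, y):
    pred, feats = forward(grades, x)
    err = pred - y
    loss = float(np.mean(err ** 2))
    d_out = 2.0 * err / err.size              # dL/d(pred), shared by every grade's head
    grads = [None] * len(grades)
    d_h = np.zeros_like(feats[-1][1])         # gradient arriving from later grades
    for k in reversed(range(len(grades))):
        g, (h_in, h) = grades[k], feats[k]
        d_z = (d_out @ g["a"].T + d_h) * (1.0 - h ** 2)
        grads[k] = {"W": h_in.T @ d_z, "b": d_z.sum(0),
                    "a": h.T @ d_out, "c": d_out.sum(0)}
        d_h = d_z @ g["W"].T                  # passed back to the previous grade's features
    return loss, grads

def train(grades, x, y, trainable, steps=200, lr=0.5):
    # Backtracking gradient descent on the grades listed in `trainable`;
    # all other grades stay frozen. A step is accepted only if the loss does not rise.
    loss, grads = loss_and_grads(grades, x, y)
    history = [loss]
    for _ in range(steps):
        step = lr
        while step > 1e-10:
            trial = [{n: p - step * grads[i][n] for n, p in g.items()} if i in trainable else g
                     for i, g in enumerate(grades)]
            trial_loss, trial_grads = loss_and_grads(trial, x, y)
            if trial_loss <= loss:
                grades[:] = trial
                loss, grads = trial_loss, trial_grads
                break
            step *= 0.5
        history.append(loss)
    return history

x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = target(x)

grades, history = [], []
for k in range(3):
    # Stage one: add a grade and train only it. With earlier grades frozen,
    # minimizing the full loss is the same as fitting the current residual.
    in_dim = 1 if k == 0 else grades[-1]["W"].shape[1]
    grades.append(init_grade(in_dim, width=16))
    history += train(grades, x, y, trainable={k})
stage_one_loss = history[-1]

# Stage two: unfreeze selected grades (here the last two, an arbitrary choice)
# and retrain them jointly from the stage-one weights.
history += train(grades, x, y, trainable={1, 2})
stage_two_loss = history[-1]

print(f"stage one MSE: {stage_one_loss:.3e}, stage two MSE: {stage_two_loss:.3e}")
```

Training on the full loss with a single trainable grade is used here only to keep the sketch short; a PDE setting would replace the mean-squared fitting error with a residual-based loss, which this example does not attempt.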