The discretization of the deep Ritz method [18] for the Poisson equation leads to a high-dimensional non-convex minimization problem, which is difficult and expensive to solve numerically. In this paper, we consider the shallow Ritz approximation to one-dimensional diffusion problems and introduce an effective and efficient iterative method, a damped block Newton (dBN) method, for solving the resulting non-convex minimization problem. The method employs the block Gauss-Seidel method as an outer iteration by dividing the parameters of a shallow neural network into the linear parameters (the weights and bias of the output layer) and the non-linear parameters (the weights and bias of the hidden layer). In each outer iteration, the linear parameters are updated by exact inversion and the non-linear parameters by one step of a damped Newton method. The inverses of the coefficient matrix and the Hessian matrix are tridiagonal and diagonal, respectively, and hence the cost of each dBN iteration is $\mathcal{O}(n)$. To move the breakpoints (the non-linear parameters) more efficiently, we propose an adaptive damped block Newton (AdBN) method by combining dBN with the adaptive neuron enhancement (ANE) method [25]. Numerical examples demonstrate that dBN and AdBN move the breakpoints quickly and efficiently, and that AdBN achieves a nearly optimal order of convergence. These iterative solvers can outperform BFGS on selected examples.
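The claim that the coefficient matrix has a tridiagonal inverse can be checked numerically in a simplified model case. The sketch below is an illustration under assumptions not stated in the abstract: ReLU activation, unit diffusion coefficient on $(0,1)$, sorted and distinct breakpoints in $[0,1)$, and no boundary terms. Under these assumptions the matrix has entries $\int_0^1 H(x-b_i)H(x-b_j)\,dx = 1-\max(b_i,b_j)$, where $H$ is the Heaviside function. The function names `stiffness`, `tridiagonal_inverse`, and `matmul` are hypothetical, and the matrix is not necessarily the exact one used in the paper.

```python
def stiffness(b):
    """Model coefficient matrix: A[i][j] = 1 - max(b_i, b_j).

    Assumes ReLU neurons, unit diffusion coefficient on (0, 1),
    and no boundary terms (a simplification for illustration).
    """
    return [[1.0 - max(bi, bj) for bj in b] for bi in b]


def tridiagonal_inverse(b):
    """Candidate inverse of stiffness(b), built from the breakpoint gaps.

    Requires sorted, distinct breakpoints in [0, 1).
    """
    n = len(b)
    # h[i] is the gap to the next breakpoint; the last gap runs to x = 1.
    h = [(b[i + 1] if i + 1 < n else 1.0) - b[i] for i in range(n)]
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = 1.0 / h[i] + (1.0 / h[i - 1] if i > 0 else 0.0)
        if i + 1 < n:
            T[i][i + 1] = T[i + 1][i] = -1.0 / h[i]
    return T


def matmul(X, Y):
    """Dense matrix product, used only to verify A * T = I."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


b = [0.0, 0.1, 0.25, 0.5, 0.8]  # non-uniform breakpoints (arbitrary example)
product = matmul(stiffness(b), tridiagonal_inverse(b))
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(len(b)) for j in range(len(b)))
print(max_error)
```

Because each row of the tridiagonal matrix has at most three non-zero entries, applying it to a vector costs $\mathcal{O}(n)$ operations, which is consistent with the per-iteration cost stated above.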