Fast Iterative Solver For Neural Network Method: II. 1D Diffusion-Reaction Problems And Data Fitting

This paper extends the damped block Newton (dBN) method, introduced recently in [4], to 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tridiagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^4)$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. In the instances where the Hessian matrices are singular, a Gauss-Newton method is used in place of Newton; this modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and to outperform BFGS for select examples.
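
To make the ${\cal O}(n)$ claim concrete, the following is a minimal sketch of how a dense ReLU mass matrix can be inverted through a tridiagonal one. It is not necessarily the factorization used in the paper. It assumes uniform breakpoints $b_i = i/n$ on $[0,1]$, basis functions $\max(0, x - b_i)$ with no separate bias term, and relies on the elementary identity that each hat function is a second difference of ReLUs, which gives $M^{-1} = h^{-2} B\, T^{-1} B^T$ with $T$ the hat mass matrix and $B$ a banded second-difference matrix. The names `relu_mass_matrix` and `solve_relu_mass` are hypothetical.

```python
def relu_mass_matrix(n):
    """Dense matrix M[i][j] = integral over [0, 1] of relu(x - b_i) * relu(x - b_j).

    Assumption for this sketch: uniform breakpoints b_i = i / n, i = 0..n-1.
    """
    h = 1.0 / n
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lo, hi = min(i, j) * h, max(i, j) * h
            a, d = 1.0 - hi, hi - lo
            # Closed form of the integral of (t + d) * t for t in [0, a].
            M[i][j] = a**3 / 3.0 + d * a**2 / 2.0
    return M


def solve_relu_mass(f):
    """Solve M c = f in O(n) operations via M^{-1} = (1 / h^2) * B * T^{-1} * B^T."""
    n = len(f)
    h = 1.0 / n

    # Step 1: g = B^T f, a forward second difference (entries past the end are 0).
    fp = list(f) + [0.0, 0.0]
    g = [fp[k] - 2.0 * fp[k + 1] + fp[k + 2] for k in range(n)]

    # Step 2: solve T y = g with the Thomas algorithm. T is the tridiagonal
    # hat mass matrix for nodes h, 2h, ..., 1 (the last hat is a half hat).
    diag = [2.0 * h / 3.0] * (n - 1) + [h / 3.0]
    off = h / 6.0
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = off / diag[0]
    dp[0] = g[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (g[i] - off * dp[i - 1]) / denom
    y = [0.0] * n
    y[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        y[i] = dp[i] - cp[i] * y[i + 1]

    # Step 3: c = B y / h^2, a backward second difference (entries before the start are 0).
    yp = [0.0, 0.0] + y
    return [(yp[i + 2] - 2.0 * yp[i + 1] + yp[i]) / h**2 for i in range(n)]


if __name__ == "__main__":
    n = 16
    M = relu_mass_matrix(n)
    f = [1.0] * n
    c = solve_relu_mass(f)
    # Check the O(n) solve against the dense matrix by computing the residual.
    residual = max(abs(sum(M[i][j] * c[j] for j in range(n)) - f[i]) for i in range(n))
    print("max residual:", residual)
```

The dense matrix is built here only to check the result; the solve itself touches each entry of `f` a constant number of times. The division by $h^2$ in the last step also hints at why the dense system is so much worse conditioned than the tridiagonal one.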

