We point out that neural networks are not black boxes: their generalization stems from the ability to dynamically map a dataset onto the extrema of the model function. We further prove that the number of extrema of a neural network is positively correlated with the number of its parameters. We then propose a new algorithm, significantly different from the back-propagation algorithm, that obtains parameter values mainly by solving a system of linear equations. Difficult situations such as vanishing gradients and overfitting can be simply explained and handled within this framework.
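The abstract does not spell out how the linear-system solve replaces gradient descent, so the sketch below is only a rough illustration of the general idea: if part of the network is held fixed, the remaining parameters can be found by solving a linear least-squares system in one shot. The toy dataset, the network size, and the choice to freeze the hidden layer at random values are all assumptions made here for illustration, not the paper's actual algorithm.

```python
# Minimal sketch: obtain output-layer weights by solving a linear
# system instead of iterating back-propagation updates.
# Assumption (not from the paper): hidden-layer weights are fixed at
# random, so the fit reduces to ordinary least squares.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: learn y = sin(x) on [-pi, pi].
X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(X).ravel()

# Fixed random hidden layer (hypothetical choice for illustration).
n_hidden = 50
W = rng.normal(size=(1, n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)            # hidden activations, shape (200, 50)

# Output weights from a linear system: solve H @ w ~ y in the
# least-squares sense, a single direct solve with no gradients.
w, *_ = np.linalg.lstsq(H, y, rcond=None)

y_hat = H @ w
print("max abs error:", np.max(np.abs(y_hat - y)))
```

Because the solve is direct, issues like vanishing gradients simply do not arise for the parameters obtained this way, which loosely mirrors the abstract's claim that such difficulties become easy to explain and handle in a linear-equation framework.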