We present a simple linear-regression-based approach for learning the weights and biases of a neural network, as an alternative to standard gradient-based backpropagation. The present work is exploratory in nature, and we restrict the description and experiments to (i) simple feedforward neural networks, (ii) scalar (single-output) regression problems, and (iii) invertible activation functions. However, the approach is intended to be extensible to larger, more complex architectures. The key idea is the observation that the total input to every neuron in a neural network is linear in the activations of the neurons in the previous layer when the layer's parameters (weights and biases) are held fixed, and linear in those parameters when the activations are held fixed. If we can compute the ideal total input to every neuron by working backwards from the output, we can formulate the learning problem as a sequence of linear least-squares problems, iterating between updating the parameters and updating the activation values. We present an explicit algorithm that implements this idea, and we show that (at least for small problems) the approach is more stable and faster than gradient-based methods.
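To make the alternating idea concrete, the following is a minimal sketch in Python with NumPy. It is not the algorithm of this paper, only one plausible reading of the abstract under stated assumptions: a single hidden layer, a leaky-ReLU hidden activation (chosen because it is invertible everywhere), an identity output activation, and a minimum-norm correction to obtain the "ideal" hidden activations. All names (`act`, `act_inv`, `lstsq_layer`, `fit`, `predict`, `ALPHA`) and the toy data are hypothetical.

```python
import numpy as np

ALPHA = 0.1  # leaky-ReLU slope; an assumed choice, any invertible activation would do


def act(z):
    """Leaky ReLU, invertible on the whole real line."""
    return np.where(z >= 0, z, ALPHA * z)


def act_inv(a):
    """Inverse of the leaky ReLU above."""
    return np.where(a >= 0, a, a / ALPHA)


def lstsq_layer(inputs, targets):
    """Least-squares fit of W, b such that inputs @ W + b ~= targets."""
    design = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    theta, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return theta[:-1], theta[-1]


def fit(X, y, hidden=8, iters=20, seed=0):
    """Alternate between parameter updates and activation updates (sketch)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    Y = y.reshape(-1, 1)
    for _ in range(iters):
        # 1. Hidden activations under the current first-layer parameters.
        A1 = act(X @ W1 + b1)
        # 2. Output layer: a linear least-squares problem given A1.
        W2, b2 = lstsq_layer(A1, Y)
        # 3. "Ideal" hidden activations: smallest change to A1 that
        #    reproduces the targets exactly through the output layer.
        resid = Y - (A1 @ W2 + b2)
        A1_star = A1 + resid @ W2.T / max(np.sum(W2 ** 2), 1e-12)
        # 4. Invert the activation to get ideal total inputs, then fit
        #    the first layer to them by least squares.
        Z1_star = act_inv(A1_star)
        W1, b1 = lstsq_layer(X, Z1_star)
    # Final output-layer refit so that W2, b2 match the last W1, b1.
    W2, b2 = lstsq_layer(act(X @ W1 + b1), Y)
    return W1, b1, W2, b2


def predict(params, X):
    W1, b1, W2, b2 = params
    return (act(X @ W1 + b1) @ W2 + b2).ravel()


# Toy scalar regression problem (synthetic data, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

params = fit(X, y)
mse = np.mean((predict(params, X) - y) ** 2)
print("training MSE:", mse)
```

Because the output layer is refit by least squares with a bias term as the last step, the training error of this sketch cannot exceed that of a constant predictor. It makes no claim about matching the stability or speed results reported for the method described above.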