This work addresses the weight optimization problem for fully-connected feed-forward neural networks. Unlike existing approaches based on back-propagation (BP) and chain-rule gradient-based optimization, which require iterative execution that can be burdensome and time-consuming in some cases, the proposed approach provides a closed-form solution for weight optimization by means of the least squares (LS) methodology. When the input-to-output mapping is injective, the new approach optimizes the weights in a back-propagating fashion in a single iteration, jointly optimizing the set of weights of each neuron in each layer. When the input-to-output mapping is not injective (e.g., in classification problems), the proposed solution is easily adapted to reach its final solution in a few iterations. An important advantage over existing solutions is that the computations for all neurons in a layer are independent of each other; thus, they can be carried out in parallel to optimize all weights in a given layer simultaneously. Furthermore, the running time is deterministic in the sense that one can obtain the exact number of computations needed to optimize the weights in all network layers (per iteration, in the case of a non-injective mapping). Our simulation and empirical results show that the proposed scheme, BPLS, is competitive with existing ones in terms of accuracy while significantly outperforming them in terms of running time. To summarize, the new method is straightforward to implement, competitive in accuracy, computationally more efficient than existing methods, and well suited to parallel implementation.
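To make the closed-form idea concrete, the following is a minimal single-layer sketch, not the BPLS algorithm itself. It assumes an invertible (hence injective) leaky-ReLU activation, which the abstract does not specify. The layer targets are mapped back through the inverse activation, and the weights are then obtained with one least-squares solve. The function names (`leaky_relu`, `inverse_leaky_relu`, `fit_layer_least_squares`) and the slope value are hypothetical choices made for illustration.

```python
import numpy as np


def leaky_relu(z, slope=0.1):
    """Invertible activation (assumed here; not taken from the paper)."""
    return np.where(z >= 0, z, slope * z)


def inverse_leaky_relu(a, slope=0.1):
    """Exact inverse of leaky_relu for slope > 0."""
    return np.where(a >= 0, a, a / slope)


def fit_layer_least_squares(inputs, targets, slope=0.1):
    """Closed-form fit of one layer so that
    targets ~= leaky_relu([inputs, 1] @ weights).

    inputs:  (n_samples, n_in)
    targets: (n_samples, n_neurons)
    returns: (n_in + 1, n_neurons); the last row holds the biases.
    """
    # Append a constant column so each bias is estimated with its weights.
    design = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    # Undo the activation so the problem becomes linear in the weights.
    pre_activation = inverse_leaky_relu(targets, slope)
    # Each column of the solution is an independent least-squares problem
    # (one per neuron), so the columns could also be solved in parallel.
    weights, *_ = np.linalg.lstsq(design, pre_activation, rcond=None)
    return weights


# Noise-free synthetic check: recover the weights that generated the data.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4))
w_true = rng.normal(size=(5, 3))
y = leaky_relu(np.hstack([x, np.ones((200, 1))]) @ w_true)

w_hat = fit_layer_least_squares(x, y)
print("max abs weight error:", np.max(np.abs(w_hat - w_true)))
```

Because every column of the right-hand side is solved independently, fitting a single neuron on its own returns the same weights as the joint solve. This is the property that makes the per-neuron computations parallelizable. The cost of the solve depends only on the matrix dimensions, which is consistent with the deterministic running time described above.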