Training convolutional neural networks is a high-dimensional, non-convex optimization problem. At present, training is inefficient in situations where learning-rate parameters cannot be set with confidence. Some past works have introduced Newton methods for training deep neural networks, but Newton methods for convolutional neural networks involve complicated operations. Computing the Hessian matrix in second-order methods becomes very complex, since finite differences are mainly used with image data. Existing Newton methods for convolutional neural networks deal with this by using sub-sampled Hessian Newton methods. In this paper, we use the complete data set instead of sub-sampled methods, which handle only part of the data at a time. Further, we use parallel rather than serial processing in the mini-batch computations. With parallel processing, our approach takes less time than the previous approach.
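The difference between a sub-sampled and a complete-data Hessian can be illustrated with a small sketch. The following is not the paper's implementation: it assumes a regularized logistic model in place of a convolutional network, and the names `batch_hessian_vector`, `full_hessian_vector`, `reg`, and `workers` are hypothetical. It computes a Hessian-vector product by summing per-mini-batch contributions over all of the data, with the mini-batches dispatched to a worker pool rather than processed one after another.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def batch_hessian_vector(batch, w, v):
    """Sum over one mini-batch of s * (1 - s) * (x . v) * x.

    This is the batch's contribution to H v for the logistic loss,
    where s = sigmoid(w . x). The full Hessian is never formed.
    """
    out = [0.0] * len(w)
    for x in batch:
        s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        coeff = s * (1.0 - s) * sum(vi * xi for vi, xi in zip(v, x))
        for j, xj in enumerate(x):
            out[j] += coeff * xj
    return out


def full_hessian_vector(batches, w, v, reg=1e-2, workers=4):
    """H v over ALL mini-batches: reg * v + (1 / n) * sum of batch terms.

    Assumption: mini-batches are independent, so each one is handed to
    a worker; a sub-sampled method would pass only some of `batches`.
    """
    n = sum(len(b) for b in batches)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda b: batch_hessian_vector(b, w, v), batches)
        )
    total = [sum(col) for col in zip(*partials)]
    return [reg * vi + ti / n for vi, ti in zip(v, total)]


if __name__ == "__main__":
    batches = [[[1.0, 0.0]], [[0.0, 2.0]]]
    print(full_hessian_vector(batches, w=[0.0, 0.0], v=[1.0, 1.0], reg=0.0))
```

The thread pool here only shows the structure of the computation. For pure-Python arithmetic like this, CPython's global interpreter lock generally prevents a real speedup, so an actual timing gain would need process-based or GPU parallelism.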