Deep learning is widely used across many fields, but training a model typically consumes massive computational resources and time. Designing an efficient neural network training method with a provable convergence guarantee is therefore a fundamental and important research question. In this paper, we consider a fully connected two-layer neural network with shifted ReLU activation and use a static half-space reporting data structure to identify the activated neurons in sublinear time via geometric search. We also prove that our algorithm converges in $O(M^2/\epsilon^2)$ time with network size quadratic in the coefficient norm upper bound $M$ and the error term $\epsilon$.
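To make the notion of activated neuron identification concrete, the following is a minimal sketch under stated assumptions, not the paper's data structure. It assumes the shifted ReLU has the form $\sigma_b(z) = \max(z - b, 0)$, so that neuron $r$ is activated on input $x$ exactly when $\langle w_r, x \rangle > b$, which is a half-space membership query. The function names, the output weights `a`, and the unnormalized output sum are all hypothetical choices made for illustration. The query here is a plain linear scan over neurons; a half-space reporting structure would replace that scan to reach sublinear query time.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def shifted_relu(z, b):
    """Assumed form of the shifted ReLU: max(z - b, 0)."""
    return max(z - b, 0.0)


def activated_neurons(weights, x, b):
    """Return indices r with <w_r, x> > b.

    This is a linear scan used as a stand-in for a half-space
    reporting query; it is NOT sublinear.
    """
    return [r for r, w in enumerate(weights) if dot(w, x) > b]


def forward_dense(weights, a, x, b):
    """Two-layer output computed over every hidden neuron."""
    return sum(a[r] * shifted_relu(dot(w, x), b) for r, w in enumerate(weights))


def forward_sparse(weights, a, x, b):
    """Same output, but touching only the activated neurons."""
    active = activated_neurons(weights, x, b)
    return sum(a[r] * shifted_relu(dot(weights[r], x), b) for r in active)


if __name__ == "__main__":
    random.seed(0)
    m, d, b = 1000, 8, 2.0  # hypothetical width, input dimension, shift
    weights = [[random.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    a = [random.choice([-1.0, 1.0]) for _ in range(m)]
    x = [random.gauss(0, 1) for _ in range(d)]
    active = activated_neurons(weights, x, b)
    print(f"{len(active)} of {m} neurons activated")
    print(forward_dense(weights, a, x, b), forward_sparse(weights, a, x, b))
```

Because inactive neurons contribute exactly zero, the sparse and dense forward passes agree, and the work per input after the query scales with the number of activated neurons rather than with the full width.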