We develop an analytical framework to characterize the set of optimal ReLU neural networks by reformulating the non-convex training problem as a convex program. We show that the global optima of the convex parameterization are given by a polyhedral set and then extend this characterization to the optimal set of the non-convex training objective. Since all stationary points of the ReLU training problem can be represented as optima of sub-sampled convex programs, our work provides a general expression for all critical points of the non-convex objective. We then leverage our results to provide an optimal pruning algorithm for computing minimal networks, establish conditions for the regularization path of ReLU networks to be continuous, and develop sensitivity results for minimal ReLU networks.
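For concreteness, the reformulation can be sketched in one standard setting. This assumes a two-layer ReLU network with squared loss and weight decay, and the formulation used in this work may differ in detail. The symbols are all assumptions introduced for illustration: data matrix $X \in \mathbb{R}^{n \times d}$, targets $y \in \mathbb{R}^n$, regularization strength $\lambda > 0$, and activation patterns $D_i = \mathrm{diag}(\mathbb{1}[X g_i \ge 0])$ for gate vectors $g_i$.

```latex
% Non-convex training problem (m hidden units, weight decay):
\min_{W_1, w_2} \; \frac{1}{2} \Big\| \sum_{j=1}^{m} (X W_{1j})_+ \, w_{2j} - y \Big\|_2^2
  + \frac{\lambda}{2} \sum_{j=1}^{m} \big( \|W_{1j}\|_2^2 + w_{2j}^2 \big)

% Convex reformulation over the P distinct activation patterns D_1, ..., D_P:
\min_{\{v_i, w_i\}_{i=1}^{P}} \; \frac{1}{2} \Big\| \sum_{i=1}^{P} D_i X (v_i - w_i) - y \Big\|_2^2
  + \lambda \sum_{i=1}^{P} \big( \|v_i\|_2 + \|w_i\|_2 \big)
\quad \text{s.t.} \quad (2 D_i - I) X v_i \ge 0, \;\; (2 D_i - I) X w_i \ge 0 .
```

In this setting the two problems share the same optimal value once the width $m$ is large enough. The second problem is a group-lasso-type program with linear constraints, and a polyhedral description of its solution set is the kind of characterization the abstract refers to. Restricting the sum to a subset of the patterns $D_i$ gives a sub-sampled convex program of the kind mentioned above.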