We study the loss landscape of both shallow and deep, mildly overparameterized ReLU neural networks on a generic finite input dataset for the squared error loss. We show, in terms of both the number and the volume of activation regions, that most activation patterns correspond to parameter regions with no bad local minima. Furthermore, for one-dimensional input data, we show that most activation regions realizable by the network contain a high-dimensional set of global minima and no bad local minima. We confirm these results experimentally, observing a phase transition, governed by the amount of overparameterization, from most regions having a full-rank Jacobian to many regions having a rank-deficient Jacobian.
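To make the experimental claim concrete, the following is a minimal sketch (not the paper's code) of how one might probe this phase transition: for a one-hidden-layer ReLU network f(x) = vᵀ relu(Wx) with one-dimensional inputs, sample random parameters, compute the Jacobian of the network outputs on a fixed generic dataset with respect to all parameters, and record how often it has full row rank as the width grows. The function name `jacobian_rank`, the specific widths, and the dataset size are illustrative assumptions.

```python
import numpy as np

def jacobian_rank(d, k, n, rng):
    """Rank of the parameter Jacobian of f(x) = v^T relu(W x),
    evaluated on n generic inputs at a random parameter draw.
    (Illustrative sketch; not the authors' implementation.)"""
    W = rng.standard_normal((k, d))
    v = rng.standard_normal(k)
    X = rng.standard_normal((n, d))            # generic finite dataset
    A = (X @ W.T > 0).astype(float)            # activation pattern, n x k
    # df(x_i)/dW_{jl} = v_j * a_{ij} * x_{il};  df(x_i)/dv_j = a_{ij} * (W x_i)_j
    J_W = (A * v)[:, :, None] * X[:, None, :]  # n x k x d
    J_v = A * (X @ W.T)                        # n x k
    J = np.concatenate([J_W.reshape(n, -1), J_v], axis=1)  # n x (kd + k)
    return np.linalg.matrix_rank(J)

rng = np.random.default_rng(0)
d, n, trials = 1, 8, 200                       # 1-D inputs, 8 data points
for k in [2, 4, 8, 16, 32]:                    # sweep hidden width
    frac = np.mean([jacobian_rank(d, k, n, rng) == n for _ in range(trials)])
    print(f"width {k:3d}: fraction of full-rank Jacobians = {frac:.2f}")
```

In this toy setup the Jacobian has n rows and k(d+1) columns, so full row rank is impossible below width k = n/(d+1) and becomes increasingly common above it, mirroring the reported transition from rank-deficient to full-rank regions as overparameterization increases.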