We prove that overparametrized neural networks are able to generalize with a test error that is independent of the level of overparametrization and of the Vapnik-Chervonenkis (VC) dimension. We prove explicit bounds that depend only on the metric geometry of the test and training sets, on the regularity properties of the activation function, and on the operator norms of the weights and the norms of the biases. For overparametrized deep ReLU networks with a training sample size bounded by the input space dimension, we explicitly construct zero-loss minimizers without the use of gradient descent, and prove a uniform generalization bound that is independent of the network architecture. We perform computational experiments testing our theoretical results on MNIST, and obtain agreement with the true test error within a 22% margin on average.
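To make the claim about explicit zero-loss minimizers concrete, the following is a minimal, hypothetical sketch, not the paper's actual construction: it only illustrates that when the number of training samples N is at most the input dimension d and the inputs are in general position, an exact interpolant of the labels can be written down in closed form, so a zero-loss minimizer exists without any gradient descent. The affine solve, the identity-weight ReLU layers, the shift constant, and all variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, C = 50, 30, 10                      # input dim, sample size N <= d, number of classes
X = rng.normal(size=(N, d))               # training inputs (in general position almost surely)
Y = np.eye(C)[rng.integers(0, C, N)]      # one-hot training labels

# Affine interpolation layer: find W, b with W x_j + b = y_j for every sample j.
A = np.hstack([X, np.ones((N, 1))])       # augment inputs with a bias column, shape (N, d+1)
Wb, *_ = np.linalg.lstsq(A, Y, rcond=None)  # exact solution since rank(A) = N <= d+1
W, b = Wb[:-1].T, Wb[-1]                  # W: (C, d), b: (C,)

Z = X @ W.T + b                           # fits the labels exactly up to round-off

# Propagate through extra ReLU layers without changing the outputs:
# shift into the positive orthant, apply ReLU (acting as the identity there), shift back.
shift = np.abs(Z).max() + 1.0             # illustrative constant making all entries positive
H = Z
for _ in range(3):                        # arbitrary extra depth
    H = np.maximum(H + shift, 0.0) - shift

train_loss = np.mean((H - Y) ** 2)
print(f"training MSE of the explicit construction: {train_loss:.2e}")  # ~ 0
```

Running this prints a training loss at the level of floating-point round-off, illustrating how a zero-loss interpolant can be constructed directly when the sample size does not exceed the input dimension; the paper's construction and its generalization bound are of course derived analytically rather than numerically.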