It is frequently observed that overparameterized neural networks generalize well. Existing theoretical work on this phenomenon focuses mainly on linear settings or fully connected neural networks. This paper studies the learning ability of an important family of deep neural networks, deep convolutional neural networks (DCNNs), in both underparameterized and overparameterized settings. We establish the first learning rates for underparameterized DCNNs that do not require the restrictions on parameters or on function variable structure imposed in the literature. We also show that by adding well-defined layers to a non-interpolating DCNN, we can obtain interpolating DCNNs that maintain the good learning rates of the non-interpolating DCNN. This result is achieved by a novel network deepening scheme designed for DCNNs. Our work provides a theoretical verification of how overfitted DCNNs generalize well.