Weight sharing, equivariance, and local filters, as in convolutional neural networks, are believed to contribute to the sample efficiency of neural networks. However, it is not clear how each of these design choices contributes to the generalization error. Through the lens of statistical learning theory, we aim to provide insight into this question by characterizing the relative impact of each choice on the sample complexity. We obtain lower and upper sample complexity bounds for a class of single-hidden-layer networks. We show that the gain from equivariance manifests directly in the bound, whereas a comparable gain from weight sharing depends on the sharing mechanism. Among our results, we obtain a completely dimension-free bound for equivariant networks under a class of pooling operations. We show that the bound depends only on the norm of the filters, which is tighter than using the spectral norm of the corresponding matrix. We also characterize the trade-off in sample complexity between parametrizing filters in the spatial versus the frequency domain, particularly when spatial filters are localized as in vanilla convolutional neural networks.
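To make the filter-norm versus spectral-norm comparison concrete, the following is a minimal numerical sketch (not taken from the paper; the signal length d, filter support k, and random filter are illustrative assumptions). For a circular convolution x ↦ w * x, the induced matrix is circulant, and its spectral norm equals the largest DFT coefficient magnitude of w, which always dominates the ℓ2 norm of the filter; hence a bound stated in terms of the filter norm is never looser than one stated in terms of the spectral norm of the matrix.

```python
import numpy as np
from scipy.linalg import circulant

rng = np.random.default_rng(0)

d, k = 64, 5                       # signal length, filter support (a localized filter)
w = np.zeros(d)
w[:k] = rng.standard_normal(k)     # filter supported on k of the d coordinates

C = circulant(w)                   # matrix of the circular convolution x -> w * x

filter_norm = np.linalg.norm(w)        # ||w||_2, what a filter-norm bound would use
spectral_norm = np.linalg.norm(C, 2)   # ||C||_2 = max_j |DFT(w)_j|

print(f"filter l2 norm: {filter_norm:.4f}")
print(f"spectral norm : {spectral_norm:.4f}")

# By Parseval, max_j |DFT(w)_j|^2 >= (1/d) * sum_j |DFT(w)_j|^2 = ||w||_2^2,
# so the filter norm lower-bounds the spectral norm for every filter.
assert filter_norm <= spectral_norm + 1e-12
```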