In this paper, we investigate the complexity of feedforward neural networks through the lens of functional equivalence: different parameterizations of a network can realize the same function. We exploit the permutation invariance property to derive a novel covering number bound for the class of feedforward neural networks, showing that accounting for this property reduces the complexity of the network class. We discuss extensions to convolutional neural networks, residual networks, and attention-based models. We also demonstrate that functional equivalence benefits optimization: overparameterized networks tend to be easier to train because increasing the network width leads to a diminishing volume of the effective parameter space. Our findings offer new insights into overparameterization and have significant implications for understanding generalization and optimization in deep learning.
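As an illustration of the permutation invariance referred to above, the following minimal sketch builds a one-hidden-layer ReLU network in NumPy, reorders its hidden units, and checks that the output is unchanged. The helper names `mlp` and `permute_hidden`, the layer sizes, and the choice of ReLU are assumptions made for this example; they are not taken from the paper.

```python
import math
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer ReLU network: x -> W2 @ relu(W1 @ x + b1) + b2."""
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and b1, and the matching columns of W2."""
    return W1[perm], b1[perm], W2[:, perm]

# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
d_in, width, d_out = 3, 5, 2
W1 = rng.normal(size=(width, d_in))
b1 = rng.normal(size=width)
W2 = rng.normal(size=(d_out, width))
b2 = rng.normal(size=d_out)

# A random reordering of the hidden units gives a different parameter vector.
perm = rng.permutation(width)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

# Both parameterizations compute the same function (up to floating-point
# rounding, since the summation order over hidden units changes).
x = rng.normal(size=d_in)
same_output = np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))

# Number of orderings of the hidden units in this single layer.
n_orderings = math.factorial(width)
```

Here `same_output` is `True`, and a hidden layer of width 5 admits 5! = 120 orderings of its units, each describing the same function. This is the kind of redundancy in parameter space that the abstract describes the covering number bound as exploiting.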