In this paper, we investigate the complexity of feedforward neural networks through the concept of functional equivalence: different network parameterizations can realize the same function. We use the permutation invariance property to derive a novel covering number bound for the class of feedforward neural networks, which shows that exploiting this property reduces the complexity of a neural network. Furthermore, based on the symmetric structure of the parameter space, we demonstrate that an appropriate random parameter initialization strategy can increase the probability of convergence for optimization. We find that overparameterized networks tend to be easier to train, in the sense that increasing the width of a neural network leads to a vanishing volume of the effective parameter space. Our findings offer new insights into overparameterization and have significant implications for understanding generalization and optimization in deep learning.
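The functional equivalence described in the abstract can be illustrated with a minimal sketch. It assumes a one-hidden-layer ReLU network and uses hypothetical helper names (`mlp_forward`, `permute_hidden_units`, `equivalent_fraction`) that do not come from the paper. Reordering the hidden units changes the parameters but not the function, and a hidden layer of width d admits d! such reorderings, so the share of parameter space needed to represent every function at least once shrinks roughly like 1/d! under this symmetry alone. This is only an illustration of the idea, not the paper's covering number bound.

```python
import math
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer ReLU network: x -> W2 @ relu(W1 @ x + b1) + b2."""
    hidden = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ hidden + b2

def permute_hidden_units(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and b1, and the matching columns of W2."""
    return W1[perm], b1[perm], W2[:, perm]

def equivalent_fraction(width):
    """1 / width!: share of hidden-unit orderings kept if only one
    representative per permutation class is retained (illustrative assumption)."""
    return 1.0 / math.factorial(width)

# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
d_in, width, d_out = 3, 5, 2
W1 = rng.normal(size=(width, d_in))
b1 = rng.normal(size=width)
W2 = rng.normal(size=(d_out, width))
b2 = rng.normal(size=d_out)
x = rng.normal(size=d_in)

# A fixed, non-identity permutation of the five hidden units.
perm = np.array([4, 2, 0, 1, 3])
W1p, b1p, W2p = permute_hidden_units(W1, b1, W2, perm)

y = mlp_forward(x, W1, b1, W2, b2)
y_perm = mlp_forward(x, W1p, b1p, W2p, b2)

# Different parameters, same function value at x.
params_differ = not np.allclose(W1, W1p)
outputs_match = np.allclose(y, y_perm)
print(params_differ, outputs_match)

# The retained fraction shrinks rapidly as the width grows.
for w in (2, 5, 10, 20):
    print(w, equivalent_fraction(w))
```

The check compares outputs at a single input for brevity. The equality holds for every input, because the ReLU acts elementwise and the output layer sums over hidden units in whatever order they are stored.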