We propose a new bound for generalization of neural networks using Koopman operators. Whereas most existing work focuses on low-rank weight matrices, we focus on full-rank weight matrices. Our bound is tighter than existing norm-based bounds when the condition numbers of the weight matrices are small. In particular, it is completely independent of the width of the network if the weight matrices are orthogonal. Our bound does not contradict the existing bounds; rather, it complements them. As several existing empirical results indicate, low-rankness is not the only reason for generalization. Furthermore, our bound can be combined with the existing bounds to obtain a tighter bound. Our result sheds new light on the generalization of neural networks with full-rank weight matrices, and it provides a connection between operator-theoretic analysis and generalization of neural networks.
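As a small illustration of why orthogonal weight matrices are the favorable case for a condition-number-dependent bound, the sketch below checks numerically that an orthogonal matrix stretches every input by the same factor, so its condition number is 1 at any width. It does not reproduce the bound itself, whose exact form is not stated here. The helper names (`householder`, `stretch_ratio`) and the sampling-based estimate are assumptions made for this example only.

```python
import math
import random

def householder(v):
    """Orthogonal reflection H = I - 2 v v^T / (v^T v), returned as a list of rows."""
    n = len(v)
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv for j in range(n)]
            for i in range(n)]

def matvec(W, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))

def stretch_ratio(W, trials=200, seed=0):
    """Sampled lower bound on the condition number of W:
    (largest observed ||Wx|| / ||x||) divided by (smallest observed ||Wx|| / ||x||)."""
    rng = random.Random(seed)
    n = len(W[0])
    gains = []
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        gains.append(norm(matvec(W, x)) / norm(x))
    return max(gains) / min(gains)

# Orthogonal matrices of increasing width: the ratio stays at 1 up to rounding.
orthogonal_ratios = {}
for width in (4, 16, 64):
    rng = random.Random(width)
    H = householder([rng.gauss(0.0, 1.0) for _ in range(width)])
    orthogonal_ratios[width] = stretch_ratio(H)

# A poorly conditioned comparison: diag(10, 1, 1, 1) has condition number 10.
skewed = [[(10.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(4)]
          for i in range(4)]
skewed_ratio = stretch_ratio(skewed)

print(orthogonal_ratios)
print(skewed_ratio)
```

Because the estimate uses random probes, it only bounds the true condition number from below. For the orthogonal case that is enough: every probe is stretched by exactly 1, so the ratio is 1 regardless of width.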