Numerous generalization bounds have been proposed in the literature as potential explanations for the ability of neural networks to generalize in the overparameterized setting. However, none of these bounds are tight. For instance, in their paper "Fantastic Generalization Measures and Where to Find Them", Jiang et al. (2020) examine more than a dozen generalization bounds and show empirically that none of them imply guarantees that can explain the remarkable performance of neural networks. This raises the question of whether tight generalization bounds are possible at all. We consider two types of generalization bounds common in the literature. (1) Bounds that depend on the training set and the output of the learning algorithm. There are multiple bounds of this type in the literature (e.g., norm-based and margin-based bounds), but we prove mathematically that no such bound can be uniformly tight in the overparameterized setting. (2) Bounds that depend on the training set and on the learning algorithm (e.g., stability bounds). For these bounds, we show a trade-off between the algorithm's performance and the bound's tightness: if the algorithm achieves good accuracy on certain distributions in the overparameterized setting, then no generalization bound can be tight for it. We conclude that generalization bounds in the overparameterized setting cannot be tight without suitable assumptions on the population distribution.