A central challenge in understanding generalization is to obtain non-vacuous guarantees that go beyond worst-case complexity over data or weight space. Among existing approaches, PAC-Bayes bounds stand out because they can provide tight, data-dependent guarantees even for large networks. However, in ReLU networks, rescaling invariances mean that different weight distributions can represent the same function while yielding arbitrarily different PAC-Bayes complexities. We propose to study PAC-Bayes bounds in an invariant, lifted representation that resolves this ambiguity. This paper explores both the guarantees this approach provides (invariance, tighter bounds via the data-processing inequality) and the algorithmic aspects of KL-based, rescaling-invariant PAC-Bayes bounds.
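The rescaling invariance at the heart of the problem follows from the positive homogeneity of ReLU: relu(az) = a·relu(z) for a > 0, so scaling a layer's incoming weights up and its outgoing weights down by the same factor leaves the network function unchanged while changing weight-space quantities. The following minimal numerical sketch (the names `alpha`, `W1`, `W2`, and `forward` are illustrative, not from the paper) demonstrates this for a two-layer ReLU network:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer ReLU network: f(x) = W2 @ relu(W1 @ x)
W1 = rng.standard_normal((8, 4))
W2 = rng.standard_normal((3, 8))

def relu(z):
    return np.maximum(z, 0.0)

def forward(W1, W2, x):
    return W2 @ relu(W1 @ x)

# Rescale: multiply the hidden layer's incoming weights by alpha > 0
# and its outgoing weights by 1/alpha. Positive homogeneity of ReLU
# guarantees the network computes the exact same function.
alpha = 100.0
W1_scaled = alpha * W1
W2_scaled = W2 / alpha

x = rng.standard_normal(4)
print(np.allclose(forward(W1, W2, x),
                  forward(W1_scaled, W2_scaled, x)))   # True: same function

# Yet weight-space quantities that a naive PAC-Bayes prior/posterior
# depends on (e.g. weight norms) change by a factor of alpha:
print(np.linalg.norm(W1), np.linalg.norm(W1_scaled))   # differ by 100x
```

Because such rescalings can be applied with arbitrary factors, a weight-space PAC-Bayes complexity can be made arbitrarily large or small without changing the underlying function, which motivates working in an invariant, lifted representation.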