Providing generalization guarantees for stochastic optimization algorithms remains a key challenge in learning theory. Recently, numerous works have demonstrated the impact of the geometric properties of optimization trajectories on generalization performance. These works propose worst-case generalization bounds in terms of various notions of intrinsic dimension and/or topological complexity, which have been found to correlate empirically with the generalization error. However, most of these approaches involve intractable mutual information terms, which limits a full understanding of the bounds. In contrast, some authors have built on algorithmic stability to obtain worst-case bounds involving geometric quantities of a combinatorial nature, which are impractical to compute. In this paper, we address these limitations by combining empirically relevant complexity measures with a framework that avoids intractable quantities. To this end, we introduce the concept of \emph{random set stability}, tailored to the data-dependent random sets produced by stochastic optimization algorithms. Within this framework, we show that the worst-case generalization error can be bounded in terms of (i) the random set stability parameter and (ii) empirically relevant, data- and algorithm-dependent complexity measures of the random set. Moreover, our framework improves on existing topological generalization bounds by recovering previous complexity notions without relying on mutual information terms. Through a series of experiments in practically relevant settings, we validate our theory by evaluating the tightness of our bounds and the interplay between topological complexity and stability.