We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. We derive time-uniform generalizations of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds and, more importantly, general techniques for constructing them. Despite being anytime-valid, our extensions remain as tight as their fixed-time counterparts. Moreover, they enable us to relax traditional assumptions; in particular, we consider nonstationary loss functions and non-i.i.d. data. In sum, we unify the derivation of past bounds and ease the search for future bounds: one may simply check if our supermartingale or submartingale conditions are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.
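As a rough sketch of how the four tools chain together, the following derivation may help. The notation is our own assumption and does not appear in the abstract: $\Theta$ is a parameter space, $\nu$ a data-free prior on $\Theta$, $\rho$ any posterior with $\rho \ll \nu$, and $(M_t(\theta))_{t \ge 0}$ a process assumed to be a nonnegative supermartingale with $M_0(\theta) \le 1$ for each fixed $\theta$. The final display specializes to i.i.d. losses $\ell_i(\theta) \in [0,1]$ with risk $R(\theta)$ and a fixed $\lambda > 0$, purely as an illustration via Hoeffding's lemma; it is not claimed to be one of the paper's stated bounds.

```latex
% (a) Assumed: for each fixed \theta, (M_t(\theta))_{t \ge 0} is a nonnegative
%     supermartingale with M_0(\theta) \le 1.
% (b) Method of mixtures: by Tonelli's theorem, the prior-weighted average
\[
  M_t^{\mathrm{mix}} \;=\; \int_\Theta M_t(\theta)\, \mathrm{d}\nu(\theta)
\]
%     is again a nonnegative supermartingale with M_0^{mix} \le 1.
% (c) Donsker-Varadhan, applied with h(\theta) = \log M_t(\theta):
\[
  \sup_{\rho \ll \nu}
  \Bigl\{ \mathbb{E}_{\theta \sim \rho}\bigl[\log M_t(\theta)\bigr]
          - \mathrm{KL}(\rho \,\|\, \nu) \Bigr\}
  \;=\; \log M_t^{\mathrm{mix}} .
\]
% (d) Ville's inequality: for any \delta \in (0,1),
\[
  \mathbb{P}\bigl( \exists\, t \ge 0 : M_t^{\mathrm{mix}} \ge 1/\delta \bigr)
  \;\le\; \delta .
\]
% Combining (b)-(d): with probability at least 1 - \delta,
% simultaneously for all t \ge 0 and all \rho \ll \nu,
\[
  \mathbb{E}_{\theta \sim \rho}\bigl[\log M_t(\theta)\bigr]
  \;\le\; \mathrm{KL}(\rho \,\|\, \nu) + \log(1/\delta) .
\]
% Illustration (assumed setting): i.i.d. losses in [0,1] and fixed \lambda > 0.
% By Hoeffding's lemma,
%   M_t(\theta) = \exp\Bigl( \lambda \sum_{i=1}^{t} \bigl(R(\theta) - \ell_i(\theta)\bigr)
%                            - t\lambda^2/8 \Bigr)
% is a nonnegative supermartingale with M_0(\theta) = 1, so for all t \ge 1,
\[
  \mathbb{E}_{\rho}\bigl[R(\theta)\bigr]
  \;\le\;
  \mathbb{E}_{\rho}\Bigl[\tfrac{1}{t}\sum_{i=1}^{t} \ell_i(\theta)\Bigr]
  + \frac{\lambda}{8}
  + \frac{\mathrm{KL}(\rho \,\|\, \nu) + \log(1/\delta)}{\lambda t} .
\]
```

Because the event controlled by Ville's inequality is uniform in $t$, the resulting guarantee holds at arbitrary stopping times, which is what distinguishes it from a fixed-sample-size bound obtained via Markov's inequality at a single $t$.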