In this work, we introduce novel information-theoretic generalization bounds using the conditional $f$-information framework, an extension of the traditional conditional mutual information (MI) framework. We provide a generic approach to deriving generalization bounds via $f$-information in the supersample setting, applicable to both bounded and unbounded loss functions. Unlike previous MI-based bounds, our proof strategy does not rely on upper bounding the cumulant-generating function (CGF) in the variational formula of MI. Instead, we set the CGF or its upper bound to zero by carefully selecting the measurable function invoked in the variational formula. Although some of our techniques are inspired by recent advances in the coin-betting framework (e.g., Jang et al. (2023)), our results do not depend on any prior regret guarantees for online gambling algorithms. Additionally, our newly derived MI-based bound recovers many previous results and sharpens our understanding of their potential limitations. Finally, we empirically compare various $f$-information measures for generalization, demonstrating that our new bounds improve on the previous ones.
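For readers unfamiliar with the proof device the abstract alludes to, here is a minimal sketch in our own notation (the symbols $P$, $Q$, $g$, and $f^{*}$ below are illustrative, not taken from the paper). The Donsker--Varadhan variational formula for the KL divergence,
$$D_{\mathrm{KL}}(P \,\|\, Q) \;=\; \sup_{g}\; \mathbb{E}_{P}[g] \;-\; \log \mathbb{E}_{Q}\!\left[e^{g}\right],$$
contains the CGF $\log \mathbb{E}_{Q}[e^{g}]$ as its second term, and prior MI-based proofs typically proceed by upper bounding this CGF. Choosing the measurable function $g$ so that $\mathbb{E}_{Q}[e^{g}] = 1$ instead makes the CGF vanish, which yields $D_{\mathrm{KL}}(P \,\|\, Q) \geq \mathbb{E}_{P}[g]$ directly. The analogous variational formula for a general $f$-divergence, with $f^{*}$ the convex conjugate of $f$, is
$$D_{f}(P \,\|\, Q) \;=\; \sup_{g}\; \mathbb{E}_{P}[g] \;-\; \mathbb{E}_{Q}\!\left[f^{*}(g)\right],$$
where, in the same spirit, picking $g$ with $f^{*}(g) \leq 0$ makes the second term nonpositive and gives $D_{f}(P \,\|\, Q) \geq \mathbb{E}_{P}[g]$ without any CGF bounding.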