In this work, we present a variety of novel information-theoretic generalization bounds for learning algorithms in the supersample setting of Steinke & Zakynthinou (2020), the setting of the "conditional mutual information" framework. Our development exploits projecting the loss pair (obtained from a training instance and a testing instance) down to a single number and correlating loss values with a Rademacher sequence (and its shifted variants). The presented bounds include square-root bounds, fast-rate bounds (including those based on variance and sharpness), and bounds for interpolating algorithms. We show, theoretically or empirically, that these bounds are tighter than all information-theoretic bounds known to date in the same supersample setting.
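To make the loss-pair projection concrete, the following is a minimal sketch, not the construction used in the paper itself. It assumes the projection is the loss difference within each supersample pair, and that a membership bit `u[i] = 0` means column 0 of pair `i` was used for training and column 1 for testing. Under these assumptions, the empirical generalization gap equals the average of the loss differences weighted by the Rademacher signs `1 - 2 * u[i]`. The function name `supersample_gap` and the random placeholder losses are hypothetical; in an actual experiment the losses would come from a hypothesis trained on the selected column.

```python
import random


def supersample_gap(losses, u):
    """Compute the test-minus-train gap in two equivalent ways.

    losses: list of (loss_col0, loss_col1) pairs, one pair per supersample row.
    u:      list of 0/1 membership bits; u[i] == 0 is assumed to mean that
            column 0 was the training instance and column 1 the testing one.
    Returns (gap_direct, gap_rademacher), each averaged over the n pairs.
    """
    n = len(losses)
    gap_direct = 0.0
    gap_rademacher = 0.0
    for (l0, l1), ui in zip(losses, u):
        # Direct definition: test loss minus training loss for this pair.
        train, test = (l0, l1) if ui == 0 else (l1, l0)
        gap_direct += test - train
        # Projection to a single number (assumed here: the loss difference),
        # correlated with the Rademacher sign derived from the membership bit.
        delta = l1 - l0
        eps = 1 - 2 * ui  # +1 if ui == 0, -1 if ui == 1
        gap_rademacher += eps * delta
    return gap_direct / n, gap_rademacher / n


# Placeholder data: random losses in [0, 1) and random membership bits.
rng = random.Random(0)
n = 8
losses = [(rng.random(), rng.random()) for _ in range(n)]
u = [rng.randint(0, 1) for _ in range(n)]

direct, via_signs = supersample_gap(losses, u)
print(abs(direct - via_signs) < 1e-12)
```

The two quantities agree for any choice of membership bits, which is what allows the gap to be analyzed through the single projected value and the sign sequence rather than through the full loss pair.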