In this work, we present a variety of novel information-theoretic generalization bounds for learning algorithms in the supersample setting of Steinke & Zakynthinou (2020), the setting of the "conditional mutual information" framework. Our development exploits projecting the loss pair (obtained from a training instance and a testing instance) down to a single number and correlating loss values with a Rademacher sequence (and its shifted variants). The presented bounds include square-root bounds, fast-rate bounds (including those based on variance and sharpness), and bounds for interpolating algorithms. We show, theoretically or empirically, that these bounds are tighter than all information-theoretic bounds known to date in the same supersample setting.
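The projection and Rademacher-correlation step can be illustrated with a minimal sketch, which is not the authors' code. It assumes the usual supersample construction: n pairs of instances, with a membership bit per pair selecting which member is used for training, the other serving as the test instance. The names `supersample_gap`, `losses`, and `u` are hypothetical. The loss values are fixed random numbers for illustration only; in the actual setting they would come from a hypothesis trained on the selected instances. The sketch checks that the test-minus-train gap equals the average of the single-number loss differences multiplied by Rademacher signs.

```python
import random


def supersample_gap(losses, u):
    """Compute the supersample generalization gap in two equivalent ways.

    losses: list of (loss_col0, loss_col1) pairs, one pair per supersample row.
    u: list of 0/1 bits; u[i] is the column of pair i used for training
       (assumed convention; the other column is the test instance).
    """
    n = len(losses)
    # Direct definition: test loss minus training loss, averaged over pairs.
    direct = sum(pair[1 - ui] - pair[ui] for pair, ui in zip(losses, u)) / n
    # Projection: collapse each loss pair to a single number.
    deltas = [b - a for a, b in losses]
    # Rademacher signs: +1 if column 0 was the training column, -1 otherwise.
    eps = [1 - 2 * ui for ui in u]
    # Correlate the projected losses with the sign sequence.
    correlated = sum(e * d for e, d in zip(eps, deltas)) / n
    return direct, correlated


random.seed(0)
n = 8
# Illustrative loss values only; not produced by any learning algorithm.
losses = [(random.random(), random.random()) for _ in range(n)]
u = [random.randint(0, 1) for _ in range(n)]
direct, correlated = supersample_gap(losses, u)
print(abs(direct - correlated) < 1e-12)
```

Under these assumptions, the gap depends on each loss pair only through one number (the difference) and a random sign, which is the quantity the abstract describes as being correlated with a Rademacher sequence.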