On the tightness of information-theoretic bounds on generalization error of learning algorithms

A recent line of work, initiated by Russo and Xu, has shown that the generalization error of a learning algorithm can be upper bounded by information measures. In most of the relevant works, the convergence rate of the expected generalization error takes the form $O(\sqrt{\lambda/n})$, where $\lambda$ is an information-theoretic quantity such as the mutual information or the conditional mutual information between the data and the learned hypothesis. However, such a rate is typically considered "slow" compared to the "fast" rate of $O(\lambda/n)$ achievable in many learning scenarios. In this work, we first show that the square root does not necessarily imply a slow rate, and that a fast-rate result can still be obtained from this bound under appropriate assumptions. Furthermore, we identify the key condition needed for a fast-rate generalization error, which we call the $(\eta,c)$-central condition. Under this condition, we give information-theoretic bounds on the generalization error and the excess risk, with a fast convergence rate for specific learning algorithms such as empirical risk minimization and its regularized version. Finally, several analytical examples are given to show the effectiveness of the bounds.
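
For concreteness, one standard instance of the square-root form is the mutual information bound of Xu and Raginsky. The notation here is an assumption added for illustration and does not appear in the abstract: $W$ is the learned hypothesis, $S=(Z_1,\dots,Z_n)$ is the i.i.d. training sample, and the loss $\ell(w,Z)$ is assumed $\sigma$-subgaussian for every fixed $w$. Taking $\lambda = I(W;S)$, the "slow" form and the "fast" form it is compared against read roughly as:

$$
\bigl|\mathbb{E}[\mathrm{gen}(W,S)]\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}\, I(W;S)}{n}} \;=\; O\!\left(\sqrt{\frac{\lambda}{n}}\right)
\qquad \text{versus} \qquad O\!\left(\frac{\lambda}{n}\right).
$$

As a purely numerical illustration, with $\lambda = 1$ and $n = 10^{4}$ the first rate gives $\sqrt{\lambda/n} = 10^{-2}$, while the second gives $\lambda/n = 10^{-4}$.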


