A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of generalization. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.