The empirical risk minimization (ERM) principle has been highly impactful in machine learning, leading both to near-optimal theoretical guarantees for ERM-based learning algorithms and to many of the recent empirical successes in deep learning. In this paper, we investigate whether the ability to perform ERM, which computes a hypothesis minimizing empirical risk on a given dataset, is necessary for efficient learning: in particular, is there a weaker oracle than ERM which can nevertheless enable learnability? We answer this question affirmatively, showing that in the realizable setting of PAC learning for binary classification, a concept class can be learned using an oracle which only returns a single bit indicating whether a given dataset is realizable by some concept in the class. The sample complexity and oracle complexity of our algorithm depend polynomially on the VC dimension of the hypothesis class, thus showing that there is only a polynomial price to pay for use of our weaker oracle. Our results extend to the agnostic learning setting with a slight strengthening of the oracle, as well as to the partial concept, multiclass, and real-valued learning settings. In the setting of partial concept classes, prior to our work no oracle-efficient algorithms were known, even with a standard ERM oracle. Thus, our results address a question of Alon et al. (2021), who asked whether there are algorithmic principles which enable efficient learnability in this setting.
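To make the oracle's semantics concrete, here is a minimal sketch of the single-bit realizability oracle for a finite concept class. The function name and the toy class of threshold concepts are our own illustrative choices, not from the paper; the paper's learning algorithm itself is not reproduced here.

```python
def realizability_oracle(dataset, concept_class):
    """Return a single bit: is there some concept in the class that
    labels every example in the dataset correctly?

    dataset: list of (x, y) pairs with y in {0, 1}
    concept_class: iterable of callables x -> {0, 1}
    """
    return any(all(c(x) == y for x, y in dataset) for c in concept_class)


# Toy concept class: threshold functions c_t(x) = 1[x >= t] for t = 0..4.
thresholds = [lambda x, t=t: int(x >= t) for t in range(5)]

# (1 -> 0, 3 -> 1) is realizable, e.g. by the threshold t = 2.
print(realizability_oracle([(1, 0), (3, 1)], thresholds))  # True

# (1 -> 1, 3 -> 0) is not realizable by any monotone threshold.
print(realizability_oracle([(1, 1), (3, 0)], thresholds))  # False
```

Note the oracle reveals strictly less than ERM: it never returns a hypothesis or an error value, only whether a consistent concept exists.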