The empirical risk minimization (ERM) principle has been highly impactful in machine learning, leading both to near-optimal theoretical guarantees for ERM-based learning algorithms and to many of the recent empirical successes in deep learning. In this paper, we investigate the question of whether the ability to perform ERM, which computes a hypothesis minimizing empirical risk on a given dataset, is necessary for efficient learning: in particular, is there a weaker oracle than ERM which can nevertheless enable learnability? We answer this question affirmatively, showing that in the realizable setting of PAC learning for binary classification, a concept class can be learned using an oracle which only returns a single bit indicating whether a given dataset is realizable by some concept in the class. The sample complexity and oracle complexity of our algorithm depend polynomially on the VC dimension of the hypothesis class, thus showing that there is only a polynomial price to pay for the use of our weaker oracle. Our results extend to the agnostic learning setting with a slight strengthening of the oracle, as well as to the partial concept, multiclass, and real-valued learning settings. In the setting of partial concept classes, prior to our work no oracle-efficient algorithms were known, even with a standard ERM oracle. Thus, our results address a question of Alon et al. (2021), who asked whether there are algorithmic principles which enable efficient learnability in this setting.
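To make the oracle model concrete, the following sketch (not the paper's algorithm) illustrates a one-bit realizability oracle for the toy class of threshold functions on the reals, and shows how such an oracle can be queried to label a fresh point consistently with a realizable sample. The function names `realizable` and `predict` are illustrative choices, not from the paper.

```python
# Toy class: thresholds h_t(x) = 1 iff x >= t, for t a real number.

def realizable(dataset):
    """One-bit oracle: does some threshold t label every (x, y) in
    dataset correctly? True iff every 0-labeled point lies strictly
    below every 1-labeled point."""
    ones = [x for x, y in dataset if y == 1]
    zeros = [x for x, y in dataset if y == 0]
    if not ones or not zeros:
        return True
    return max(zeros) < min(ones)

def predict(sample, x):
    """Label a fresh point x using only a realizability query:
    append (x, 1) to the sample; if the augmented dataset is still
    realizable, some consistent concept labels x as 1."""
    return 1 if realizable(sample + [(x, 1)]) else 0

# Usage: a sample consistent with the hidden threshold t = 0.5.
sample = [(0.1, 0), (0.3, 0), (0.6, 1), (0.9, 1)]
print(predict(sample, 0.7))  # prints 1
print(predict(sample, 0.2))  # prints 0
```

This toy class has VC dimension 1 and the tie-break in `predict` is one-sided; achieving the polynomial sample and oracle complexity bounds claimed in the abstract for general VC classes requires the paper's actual algorithm.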