Much of learning theory is concerned with the design and analysis of probably approximately correct (PAC) learners. The closely related transductive model of learning has recently received increased scrutiny, with its learners often serving as precursors to PAC learners. Our goal in this work is to understand and quantify the exact relationship between these two models. First, we observe that modest extensions of existing results show the models to be essentially equivalent for realizable learning under most natural loss functions, up to low-order terms in the error and sample complexity. The situation for agnostic learning appears less straightforward, with sample complexities potentially separated by a $\frac{1}{\epsilon}$ factor. This is therefore where our main contributions lie. Our results are twofold: 1. For agnostic learning with bounded losses (including, for example, multiclass classification), we show that PAC learning reduces to transductive learning at the cost of low-order terms in the error and sample complexity, via an adaptation of the reduction of arXiv:2304.09167 to the agnostic setting. 2. For agnostic binary classification, we show the converse: transductive learning is essentially no more difficult than PAC learning. Together with our first result, this implies that the PAC and transductive models are essentially equivalent for agnostic binary classification. This is our most technical result, and it involves two steps: a symmetrization argument on the agnostic one-inclusion graph (OIG) of arXiv:2309.13692 to derive the worst-case agnostic transductive instance, and an expression of the error of the agnostic OIG algorithm on this instance in terms of the empirical Rademacher complexity of the class. We leave as an intriguing open question whether our second result can be extended beyond binary classification to show that the transductive and PAC models are equivalent more broadly.
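For reference, here is a minimal sketch of the standard definitions in play above; the notation ($h_S$, $\mathcal{H}$, $\widehat{\mathfrak{R}}_S$) is illustrative textbook convention, not taken from the paper.

\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Agnostic PAC guarantee: with probability at least $1-\delta$ over an
% i.i.d. sample $S$ of size $n$ from $\mathcal{D}$, the learned hypothesis
% $h_S$ is $\epsilon$-competitive with the best hypothesis in $\mathcal{H}$:
\[
  \Pr_{S \sim \mathcal{D}^n}\Big[\operatorname{err}_{\mathcal{D}}(h_S)
    \le \inf_{h \in \mathcal{H}} \operatorname{err}_{\mathcal{D}}(h) + \epsilon\Big]
  \ge 1 - \delta .
\]
% Transductive model (as in the one-inclusion-graph literature): the learner
% receives $x_1, \dots, x_n$ with the labels of all but one uniformly random
% point revealed, and is judged on its prediction at the held-out point.
%
% Empirical Rademacher complexity of $\mathcal{H}$ on $S = (x_1, \dots, x_n)$,
% with each $\sigma_i$ uniform on $\{-1, +1\}$:
\[
  \widehat{\mathfrak{R}}_S(\mathcal{H})
  = \mathbb{E}_{\sigma}\Big[\sup_{h \in \mathcal{H}}
      \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, h(x_i)\Big].
\]
\end{document}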