We study the key framework of learning with abstention in the multi-class classification setting. In this setting, the learner can choose to abstain from making a prediction at a predefined cost. We present a series of new theoretical and algorithmic results for this learning problem in the predictor-rejector framework. We introduce several new families of surrogate losses for which we prove strong non-asymptotic and hypothesis-set-specific consistency guarantees, thereby resolving positively two existing open questions. These guarantees provide upper bounds on the estimation error of the abstention loss function in terms of that of the surrogate loss. We analyze both a single-stage setting, where the predictor and rejector are learned simultaneously, and a two-stage setting, crucial in applications, where the predictor is learned in a first stage using a standard surrogate loss such as cross-entropy. These guarantees suggest new multi-class abstention algorithms based on minimizing these surrogate losses. We also report the results of extensive experiments comparing these algorithms to the current state-of-the-art algorithms on the CIFAR-10, CIFAR-100, and SVHN datasets. Our results demonstrate empirically the benefit of our new surrogate losses and show the remarkable performance of our broadly applicable two-stage abstention algorithm.
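To make the target quantity concrete, the following is a minimal sketch of an empirical abstention loss in a predictor-rejector setup. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the formula, so the sketch assumes the common convention that the predictor outputs per-class scores, the rejector outputs a real-valued score, the learner abstains when the rejector score is non-positive, and abstaining incurs a fixed cost `cost`. The function name `abstention_loss` and all parameter names are hypothetical.

```python
import numpy as np


def abstention_loss(pred_scores, reject_scores, labels, cost):
    """Per-example abstention loss (illustrative sketch only).

    Assumed convention, not taken from the abstract:
      - pred_scores:   array of shape (n, k), one score per class.
      - reject_scores: array of shape (n,); abstain when the score is <= 0.
      - labels:        array of shape (n,) with integer class labels.
      - cost:          fixed cost paid for each abstention.
    """
    pred_scores = np.asarray(pred_scores, dtype=float)
    reject_scores = np.asarray(reject_scores, dtype=float)
    labels = np.asarray(labels)

    # Predicted class is the one with the highest score.
    predictions = pred_scores.argmax(axis=1)
    # Abstain wherever the rejector score is non-positive (assumed convention).
    abstain = reject_scores <= 0
    # Zero-one error, charged only on accepted examples.
    errors = (predictions != labels).astype(float)
    return np.where(abstain, cost, errors)


# Three examples: accepted and correct, abstained, accepted and wrong.
example_losses = abstention_loss(
    pred_scores=[[2.0, 1.0, 0.0], [0.1, 0.5, 0.2], [0.3, 0.2, 0.9]],
    reject_scores=[1.0, -0.5, 0.7],
    labels=[0, 1, 0],
    cost=0.3,
)
```

This loss is piecewise constant in both the predictor and the rejector scores, so it cannot be minimized directly with gradient-based methods. That is the usual motivation for the surrogate losses that the abstract refers to.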