Classification with rejection is a learning paradigm that allows models to abstain from making predictions. The predominant approach alters the supervised learning pipeline by augmenting typical loss functions so that rejection incurs a lower loss than an incorrect prediction. We instead propose a different, distributional perspective: we seek an idealized data distribution that maximizes a pretrained model's performance. This can be formalized as the optimization of a loss's risk with a $\phi$-divergence regularization term. Given this idealized distribution, a rejection decision can be made via the density ratio between it and the data distribution. We focus on the setting where the $\phi$-divergences are specified by the family of $\alpha$-divergences. We test our framework empirically on clean and noisy datasets.
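To make the mechanism concrete, one natural formalization consistent with the abstract is the regularized problem $\min_{Q} \mathbb{E}_{Q}[\ell] + \lambda\, D_{\phi}(Q \,\|\, P)$, where $P$ is the data distribution and $\ell$ the pretrained model's loss. For the KL divergence (the $\alpha \to 1$ member of the $\alpha$-divergence family) the minimizer has the closed form $dQ^{*}/dP(x) \propto \exp(-\ell(x)/\lambda)$, so the density ratio is small exactly where the model's loss is large. Below is a minimal NumPy sketch of the resulting rejection rule; the function names, the empirical normalization, and the threshold $\tau$ are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def idealized_density_ratio(losses, lam=1.0):
    """Closed-form minimizer of  min_Q E_Q[loss] + lam * KL(Q || P):
    dQ*/dP(x) is proportional to exp(-loss(x) / lam).
    Normalized so the ratio averages to 1 under the empirical distribution
    (an illustrative choice, not necessarily the paper's)."""
    w = np.exp(-np.asarray(losses, dtype=float) / lam)
    return w / w.mean()

def reject(losses, lam=1.0, tau=0.5):
    """Abstain on points the idealized distribution downweights:
    reject x when dQ*/dP(x) < tau (tau is a hypothetical threshold)."""
    return idealized_density_ratio(losses, lam) < tau

# Usage: per-example losses of a pretrained classifier on held-out data.
losses = [0.05, 0.10, 2.30, 0.20, 3.10]
print(reject(losses))  # -> [False False  True False  True]: high-loss points are rejected
```

Other members of the $\alpha$-divergence family yield different closed-form ratios; the KL case is shown only because its tilted form is the simplest instance of the density-ratio rejection rule described above.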