What constitutes an optimal prediction strategy in out-of-distribution (OOD) setups is a fundamental question in machine learning. In this paper, we address this question with the following contributions. First, we propose three reject option models for OOD setups: the Cost-based model, the Bounded TPR-FPR model, and the Bounded Precision-Recall model. These models extend the standard reject option models used in non-OOD setups and define the notion of an optimal OOD selective classifier. Second, we establish that all the proposed models, despite their different formulations, share a common class of optimal strategies. Third, motivated by the optimal strategy, we introduce double-score OOD methods that combine uncertainty scores from two chosen OOD detectors: one focused on OOD/ID discrimination and the other on misclassification detection. In our experiments, this simple strategy consistently outperforms state-of-the-art methods. Finally, we propose novel evaluation metrics derived from the definition of the optimal strategy under the proposed OOD rejection models. These metrics provide a comprehensive and reliable assessment of OOD methods and avoid the deficiencies observed in existing evaluation approaches.
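To make the double-score idea concrete, the following is a minimal sketch of a rejection rule that uses two uncertainty scores per input. The abstract does not state how the two scores are combined, so the weighted sum used here, the function name `double_score_reject`, and the parameters `tau` and `threshold` are all illustrative assumptions rather than the paper's actual formulation.

```python
from typing import List, Sequence


def double_score_reject(
    ood_scores: Sequence[float],
    misclf_scores: Sequence[float],
    tau: float,
    threshold: float,
) -> List[bool]:
    """Return one reject decision per input (True means reject).

    ood_scores:    uncertainty from a detector aimed at OOD/ID discrimination
                   (higher = more likely OOD).
    misclf_scores: uncertainty from a detector aimed at misclassification
                   detection (higher = more likely misclassified).
    tau, threshold: hypothetical tuning parameters; the weighted sum below
                    is an assumed combination rule, not taken from the paper.
    """
    if len(ood_scores) != len(misclf_scores):
        raise ValueError("both score sequences must have the same length")
    decisions = []
    for ood, mis in zip(ood_scores, misclf_scores):
        combined = mis + tau * ood  # assumed linear combination of the two scores
        decisions.append(combined > threshold)
    return decisions


if __name__ == "__main__":
    # Toy scores for three inputs: confident ID, likely OOD, likely misclassified.
    print(double_score_reject([0.1, 0.9, 0.2], [0.2, 0.1, 0.8], tau=1.0, threshold=0.75))
```

In this sketch, an input is rejected when either score is large enough to push the combined value over the threshold, so both OOD inputs and in-distribution inputs that are likely misclassified can be rejected by a single rule.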