OpenAUC: Towards AUC-Oriented Open-Set Recognition

Traditional machine learning follows a close-set assumption: the training and test sets share the same label space. In many practical scenarios, however, some test samples inevitably belong to unknown classes (open-set). To address this issue, Open-Set Recognition (OSR), whose goal is to make correct predictions on both close-set and open-set samples, has attracted growing attention. In this direction, the vast majority of the literature focuses on the pattern of open-set samples. However, how to evaluate model performance on this challenging task remains unsolved. In this paper, a systematic analysis reveals that most existing metrics are essentially inconsistent with the aforementioned goal of OSR: (1) For metrics extended from close-set classification, such as Open-set F-score, Youden's index, and Normalized Accuracy, a model with poor open-set predictions can still avoid a low score if its close-set predictions are strong. (2) Novelty detection AUC, which measures the ranking performance between close-set and open-set samples, ignores close-set performance. To fix these issues, we propose a novel metric named OpenAUC. Compared with existing metrics, OpenAUC enjoys a concise pairwise formulation that evaluates open-set performance and close-set performance in a coupled manner. Further analysis shows that OpenAUC is free from the aforementioned inconsistencies. Finally, an end-to-end learning method is proposed to minimize the OpenAUC risk, and experimental results on popular benchmark datasets demonstrate its effectiveness. Project Page: https://github.com/wang22ti/OpenAUC.
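To make the coupled pairwise idea concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not state the formula, so the code assumes one natural reading: a (close-set, open-set) pair counts only when the close-set sample is classified correctly and the open-set sample receives the strictly higher open-set score. The function names `open_auc` and `novelty_auc`, the argument layout, and the toy data are all hypothetical.

```python
from typing import Sequence


def open_auc(
    known_labels: Sequence[int],
    known_preds: Sequence[int],
    known_scores: Sequence[float],
    unknown_scores: Sequence[float],
) -> float:
    """Assumed pairwise form: fraction of (close-set, open-set) pairs where the
    close-set sample is classified correctly AND the open-set sample has the
    strictly higher open-set score. Ties count as failures in this sketch."""
    if not (len(known_labels) == len(known_preds) == len(known_scores)):
        raise ValueError("close-set inputs must have equal length")
    if len(known_labels) == 0 or len(unknown_scores) == 0:
        raise ValueError("need at least one close-set and one open-set sample")
    hits = 0
    for label, pred, score_k in zip(known_labels, known_preds, known_scores):
        if pred != label:
            continue  # a misclassified close-set sample contributes no pairs
        hits += sum(1 for score_u in unknown_scores if score_u > score_k)
    return hits / (len(known_labels) * len(unknown_scores))


def novelty_auc(
    known_scores: Sequence[float], unknown_scores: Sequence[float]
) -> float:
    """Plain ranking AUC between close-set and open-set scores; it never looks
    at the close-set class predictions."""
    pairs = sum(1 for sk in known_scores for su in unknown_scores if su > sk)
    return pairs / (len(known_scores) * len(unknown_scores))


# Toy data (hypothetical): four close-set samples, two open-set samples.
labels = [0, 1, 2, 1]
preds = [0, 1, 0, 1]            # the third close-set sample is misclassified
known = [0.1, 0.2, 0.3, 0.6]    # open-set scores of close-set samples
unknown = [0.5, 0.9]            # open-set scores of open-set samples

print(novelty_auc(known, unknown))             # 0.875
print(open_auc(labels, preds, known, unknown)) # 0.625
```

On this toy data the ranking-only score stays at 0.875 regardless of the close-set predictions, while the coupled score drops to 0.625 because the misclassified close-set sample forfeits its two pairs. When every close-set prediction is correct, the two quantities coincide under this assumed form.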
