Existing out-of-distribution (OOD) detection methods have shown great success on balanced datasets but become ineffective in long-tailed recognition (LTR) scenarios, where 1) OOD samples are often wrongly classified into head classes and/or 2) tail-class samples are treated as OOD samples. To address these issues, current studies fit a prior distribution of auxiliary/pseudo OOD data to the long-tailed in-distribution (ID) data. However, an accurate prior of this kind is difficult to obtain, because real OOD samples are unknown in advance and the class imbalance in LTR is heavy. A straightforward way to avoid requiring this prior is to learn an outlier class that encapsulates the OOD samples. The main challenge is then to resolve the aforementioned confusion between OOD samples and head/tail-class samples when learning the outlier class. To this end, we introduce a novel calibrated outlier class learning (COCL) approach, in which 1) a debiased large margin learning method is introduced into outlier class learning to distinguish OOD samples from both head and tail classes in the representation space and 2) an outlier-class-aware logit calibration method is defined to enhance the long-tailed classification confidence. Extensive empirical results on three popular benchmarks (CIFAR10-LT, CIFAR100-LT, and ImageNet-LT) demonstrate that COCL substantially outperforms state-of-the-art OOD detection methods in LTR while also improving the classification accuracy on ID data. Code is available at https://github.com/mala-lab/COCL.
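The outlier-class idea described above can be illustrated with a minimal sketch. It assumes a classifier that outputs K+1 logits, with the last index reserved for the outlier class, and it is not the COCL implementation. The debiased large margin learning component is not reproduced, and the calibration step uses a generic prior-based logit adjustment as a stand-in, since the abstract does not specify the outlier-class-aware calibration. All function names and the `tau` parameter are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def outlier_class_loss(logits, label):
    """Cross-entropy over K+1 outputs.

    ID samples carry labels 0..K-1; auxiliary OOD samples are assumed
    to be labeled K, i.e. the extra outlier class at the last index.
    """
    return -math.log(softmax(logits)[label])

def ood_score(logits):
    """Probability assigned to the outlier class (higher = more likely OOD)."""
    return softmax(logits)[-1]

def calibrated_id_prediction(logits, class_priors, tau=1.0):
    """Predict an ID class after a prior-based logit adjustment.

    Assumption: this subtracts tau * log(prior) from each ID logit,
    a generic long-tail adjustment used here only as a stand-in for
    the paper's calibration. The outlier logit is dropped first.
    """
    id_logits = logits[:-1]
    adjusted = [z - tau * math.log(p) for z, p in zip(id_logits, class_priors)]
    return max(range(len(adjusted)), key=adjusted.__getitem__)

# Toy example: K = 3 ID classes (head, medium, tail) plus one outlier logit.
logits = [2.0, 1.5, 1.4, 0.1]
priors = [0.7, 0.2, 0.1]  # assumed training-set class frequencies
print(ood_score(logits))
print(calibrated_id_prediction(logits, priors, tau=0.0))  # no adjustment
print(calibrated_id_prediction(logits, priors, tau=1.0))  # favors tail classes
```

In this toy case the unadjusted prediction is the head class, while the adjusted prediction shifts to the tail class. This is the kind of head-class bias that calibration is meant to counteract, though the actual method in the repository linked above may differ substantially.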