Despite recent advances in out-of-distribution (OOD) detection, most existing studies assume a class-balanced in-distribution training set, which is rarely the case in real-world scenarios. This paper addresses the challenging task of long-tailed OOD detection, where the in-distribution data follows a long-tailed class distribution. The main difficulty lies in distinguishing OOD data from samples of the tail classes, since a classifier's ability to detect OOD instances is not strongly correlated with its accuracy on the in-distribution classes. To overcome this issue, we propose two simple ideas: (1) Expanding the in-distribution class space by introducing multiple abstention classes. This lets us build a detector with clear decision boundaries by training on OOD data with virtual labels. (2) Augmenting the context-limited tail classes by overlaying tail-class images onto context-rich OOD data. This encourages the model to pay more attention to the discriminative features of the tail classes. We also provide a clue for separating in-distribution and OOD data by analyzing gradient noise. Through extensive experiments, we demonstrate that our method outperforms the current state of the art on various benchmark datasets. Moreover, our method can be used as an add-on for existing long-tail learning approaches, significantly enhancing their OOD detection performance. Code is available at: https://github.com/Stomach-ache/Long-Tailed-OOD-Detection
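To make the two ideas above concrete, the following is a minimal, dependency-free sketch and not the authors' implementation (see the linked repository for that). Several details are assumptions that the abstract does not specify: virtual labels are drawn uniformly at random from the abstention classes, the OOD score is taken to be the softmax mass on the abstention classes, and the overlay pastes a rectangular crop of a tail-class image onto an OOD image at the same location. All function names are hypothetical.

```python
import math
import random


def virtual_labels(num_ood, num_id_classes, num_abstention, rng):
    """Give each OOD sample a label in one of the extra abstention classes.

    Assumption: uniform random assignment is only a placeholder; the
    abstract does not say how virtual labels are chosen.
    """
    return [num_id_classes + rng.randrange(num_abstention) for _ in range(num_ood)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ood_score(logits, num_id_classes):
    """Assumed scoring rule: probability mass on the abstention classes."""
    return sum(softmax(logits)[num_id_classes:])


def overlay_patch(tail_img, ood_img, top, left, height, width):
    """Paste a crop of a tail-class image onto a copy of an OOD image.

    Images are 2D lists of pixel values with identical shapes. The result
    would keep the tail-class label (an assumption of this sketch).
    """
    out = [row[:] for row in ood_img]  # copy so the OOD image is untouched
    for r in range(top, top + height):
        out[r][left:left + width] = tail_img[r][left:left + width]
    return out


# Toy usage: 10 in-distribution classes plus 3 abstention classes.
rng = random.Random(0)
labels = virtual_labels(5, num_id_classes=10, num_abstention=3, rng=rng)

# Logits that favour the first abstention class give a high OOD score.
logits = [0.0] * 10 + [5.0, 0.0, 0.0]
score = ood_score(logits, num_id_classes=10)

# A 4x4 "tail" image of ones overlaid onto a 4x4 "OOD" image of zeros.
tail = [[1] * 4 for _ in range(4)]
ood = [[0] * 4 for _ in range(4)]
mixed = overlay_patch(tail, ood, top=1, left=1, height=2, width=2)
```

In a real training loop, the virtual labels would be fed to the same cross-entropy loss as the in-distribution labels, over a classifier head with `num_id_classes + num_abstention` outputs, and the overlaid images would be added to the tail-class training samples.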