Out-of-distribution (OOD) detection under long-tailed distributions is highly challenging because the scarcity of tail-class samples blurs decision boundaries in the feature space. Current state-of-the-art (SOTA) methods typically employ Outlier Exposure (OE) strategies, which rely on large-scale real external datasets (such as 80 Million Tiny Images) to regularize the feature space. However, this dependence on external data is often infeasible in practical deployment because of high data acquisition costs and privacy sensitivity. To this end, we propose a novel data-free framework that eliminates reliance on external datasets while maintaining superior detection performance. We introduce a Geometry-guided virtual Outlier Synthesis (GOS) strategy that models the statistical properties of features using the von Mises-Fisher (vMF) distribution on a hypersphere. Specifically, we locate a low-likelihood annulus in the feature space and perform directional sampling of virtual outliers within this region. In addition, we introduce a new Dual-Granularity Semantic Loss (DGS) that uses contrastive learning to maximize the separation between in-distribution (ID) features and these synthesized boundary outliers. Extensive experiments on benchmarks such as CIFAR-LT demonstrate that our method outperforms SOTA approaches that rely on real external images.
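To make the idea of sampling virtual outliers from a low-likelihood annulus on the hypersphere more concrete, the following is a minimal NumPy sketch and not the GOS procedure itself, whose details are not given here. It assumes that a single class is summarized by a vMF mean direction, that the annulus is defined by a band of cosine similarities to that direction, and that the band is placed just below the bulk of the ID cosines. The function names `fit_vmf` and `sample_annulus`, the 1% quantile, and the band width of 0.2 are all hypothetical choices made for illustration. The concentration estimate uses a commonly used closed-form approximation rather than an exact maximum-likelihood solution.

```python
import numpy as np


def fit_vmf(features):
    """Estimate a vMF mean direction and an approximate concentration.

    `features` has shape (n, d). Rows are L2-normalized before fitting.
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    mean_vec = x.mean(axis=0)
    r_bar = np.linalg.norm(mean_vec)  # mean resultant length, in (0, 1)
    mu = mean_vec / r_bar             # unit mean direction
    d = x.shape[1]
    # Closed-form approximation of the concentration parameter (assumption:
    # good enough for illustration; it is not the exact MLE).
    kappa = r_bar * (d - r_bar**2) / (1.0 - r_bar**2)
    return mu, kappa


def sample_annulus(mu, cos_lo, cos_hi, n, rng):
    """Draw n unit vectors whose cosine to `mu` lies in [cos_lo, cos_hi]."""
    d = mu.shape[0]
    t = rng.uniform(cos_lo, cos_hi, size=n)        # target cosine per sample
    noise = rng.standard_normal((n, d))
    tangent = noise - np.outer(noise @ mu, mu)     # remove the component along mu
    tangent /= np.linalg.norm(tangent, axis=1, keepdims=True)
    # Combine the radial and tangential parts; the result has unit norm.
    return t[:, None] * mu + np.sqrt(1.0 - t**2)[:, None] * tangent


# Toy ID features clustered around one direction (synthetic, for illustration).
rng = np.random.default_rng(0)
d = 16
center = np.zeros(d)
center[0] = 1.0
id_feats = center + 0.1 * rng.standard_normal((500, d))

mu, kappa = fit_vmf(id_feats)
id_unit = id_feats / np.linalg.norm(id_feats, axis=1, keepdims=True)
id_cos = id_unit @ mu

# Hypothetical annulus: just below the 1% quantile of the ID cosines.
cos_hi = float(np.quantile(id_cos, 0.01))
cos_lo = max(cos_hi - 0.2, -1.0)
outliers = sample_annulus(mu, cos_lo, cos_hi, 64, rng)
```

Under the vMF model the density depends on a point only through its cosine to the mean direction, so a band of cosines corresponds to a band of likelihood values. In this sketch the sampled points sit on the unit hypersphere at a controlled angular distance from the class, which is one plausible reading of "boundary outliers".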