Out-of-distribution (OOD) detection is critical for the safe deployment of machine learning systems. Existing post-hoc detectors typically rely on model confidence scores or likelihood estimates in feature space, often under restrictive distributional assumptions. In this work, we introduce a third paradigm and formulate OOD detection from a diversity perspective. We propose the Vendi Novelty Score (VNS), an OOD detector based on the Vendi Scores (VS), a family of similarity-based diversity metrics. VNS quantifies how much a test sample increases the VS of the in-distribution feature set, providing a principled notion of novelty that does not require density modeling. VNS is linear-time, non-parametric, and naturally combines class-conditional (local) and dataset-level (global) novelty signals. Across multiple image classification benchmarks and network architectures, VNS achieves state-of-the-art OOD detection performance. Remarkably, VNS retains this performance when computed using only 1% of the training data, enabling deployment in memory- or access-constrained settings.
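To make the diversity perspective concrete, the following is a minimal sketch of the underlying idea, not the authors' implementation. It uses the standard definition of the Vendi Score as the exponential of the Shannon entropy of the eigenvalues of K/n, where K is a similarity matrix over n samples. The cosine-similarity kernel, the function names `vendi_score` and `novelty_gain`, and the use of a plain difference as the score are all assumptions made for illustration; the abstract does not specify the exact form of VNS. This naive version recomputes a full eigendecomposition and is therefore not the linear-time procedure the abstract describes, and it does not model the combination of class-conditional and dataset-level signals.

```python
import numpy as np


def vendi_score(features):
    """Vendi Score under a cosine-similarity kernel (kernel choice is an assumption).

    Computes exp(H), where H is the Shannon entropy of the eigenvalues of K / n
    and K is the n x n similarity matrix of the L2-normalized feature rows.
    Rows are assumed to be nonzero.
    """
    X = np.asarray(features, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm rows, so K_ii = 1
    n = X.shape[0]
    K = X @ X.T
    eigvals = np.linalg.eigvalsh(K / n)  # symmetric PSD matrix, real eigenvalues
    eigvals = eigvals[eigvals > 1e-12]   # drop numerical zeros (0 * log 0 := 0)
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


def novelty_gain(reference, x):
    """Hypothetical novelty score: increase in Vendi Score when x joins the set."""
    reference = np.asarray(reference, dtype=float)
    augmented = np.vstack([reference, np.asarray(x, dtype=float)])
    return vendi_score(augmented) - vendi_score(reference)


if __name__ == "__main__":
    # Toy in-distribution set: four identical feature vectors along one axis.
    reference = np.tile(np.array([1.0, 0.0, 0.0]), (4, 1))
    familiar = np.array([1.0, 0.0, 0.0])  # same direction as the reference set
    novel = np.array([0.0, 1.0, 0.0])     # orthogonal to the reference set
    print("gain (familiar):", novelty_gain(reference, familiar))
    print("gain (novel):   ", novelty_gain(reference, novel))
```

In this toy setting, a sample that duplicates the reference direction leaves the score essentially unchanged, while an orthogonal sample raises it, which is the behavior a diversity-based detector would threshold on.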