Detecting out-of-distribution (OOD) samples is crucial for the safe real-world deployment of machine learning models. Recent advances in vision-language foundation models have made it possible to detect OOD samples without requiring in-distribution (ID) images. However, these zero-shot methods often underperform because their confidence scores do not adequately account for ID class likelihoods. We therefore introduce CLIPScope, a zero-shot OOD detection approach that normalizes the confidence score of a sample by class likelihoods, akin to a Bayesian posterior update. Furthermore, CLIPScope incorporates a novel strategy for mining OOD classes from a large lexical database: it selects the class labels that are farthest from and nearest to the ID classes in terms of CLIP embedding distance, to maximize coverage of OOD samples. We conduct extensive ablation studies and empirical evaluations, demonstrating state-of-the-art performance of CLIPScope across various OOD detection benchmarks.
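The label-mining idea described in the abstract (choosing candidate labels that are farthest from and nearest to the ID classes in embedding space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mine_ood_labels`, the use of cosine distance, and the choice to summarize each candidate by its mean distance to all ID classes are assumptions made for illustration, and the embeddings are plain arrays standing in for CLIP text embeddings.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def mine_ood_labels(id_emb, cand_emb, n_far, n_near):
    """Pick candidate labels farthest from and nearest to the ID classes.

    Hypothetical helper, not the paper's exact procedure.
    id_emb:   (n_id, d) text embeddings of the ID class labels.
    cand_emb: (n_cand, d) text embeddings of candidate labels from a
              lexical database (assumed to already exclude the ID labels).
    Returns two index arrays into cand_emb: (far_idx, near_idx).
    """
    id_emb = l2_normalize(np.asarray(id_emb, dtype=float))
    cand_emb = l2_normalize(np.asarray(cand_emb, dtype=float))
    if n_far + n_near > len(cand_emb):
        raise ValueError("n_far + n_near exceeds the number of candidates")

    # Cosine distance from every candidate to every ID class: (n_cand, n_id).
    dist = 1.0 - cand_emb @ id_emb.T

    # Assumption: summarize each candidate by its mean distance to the ID set.
    score = dist.mean(axis=1)

    order = np.argsort(score)      # ascending, so nearest candidates come first
    near_idx = order[:n_near]
    far_idx = order[::-1][:n_far]  # descending, so farthest candidates come first
    return far_idx, near_idx


# Toy 2-D vectors in place of real CLIP text embeddings.
id_embeddings = np.array([[1.0, 0.0]])
candidate_embeddings = np.array([
    [1.0, 0.1],   # almost the same direction as the ID class
    [0.0, 1.0],   # orthogonal
    [-1.0, 0.0],  # opposite direction
    [0.9, 0.5],   # moderately close
])
far_idx, near_idx = mine_ood_labels(id_embeddings, candidate_embeddings, n_far=1, n_near=1)
```

In practice the two arrays would come from a CLIP text encoder applied to the ID class names and to the candidate words. Keeping both extremes reflects the coverage goal stated in the abstract: distant labels cover OOD samples unrelated to the ID classes, while nearby labels cover OOD samples that resemble them.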