This paper introduces a universal approach to combining out-of-distribution (OOD) detection scores. These scores span a wide range of techniques, from those that exploit the self-confidence of deep learning models to those that exploit anomalous feature behavior in the latent space. Unsurprisingly, combining such a varied population with simple statistics proves inadequate. To overcome this challenge, we propose a quantile normalization that maps these scores to p-values, effectively framing the problem as a multivariate hypothesis test. We then combine these tests using established meta-analysis tools, resulting in a more effective detector with consolidated decision boundaries. Furthermore, we obtain a probabilistically interpretable criterion by mapping the final statistic to a distribution with known parameters. Through empirical investigation, we explore different types of shifts, each affecting the data to a different degree. Our results demonstrate that our approach significantly improves overall robustness and performance across diverse OOD detection scenarios. Notably, our framework is easily extensible to future detection scores and is the first to combine decision boundaries in this context. The code and artifacts associated with this work are publicly available at https://github.com/edadaltocg/detectors.
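The pipeline described above (quantile normalization to p-values, then a meta-analysis combination) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that larger scores mean "more anomalous", that a held-out set of in-distribution scores is available for each detector, and that Fisher's method is the combination rule (one established meta-analysis tool among several). All function names are hypothetical.

```python
import math
from bisect import bisect_left


def empirical_p_value(calib_sorted, score):
    """Right-tailed empirical p-value of `score` against sorted in-distribution scores.

    Assumption: larger scores mean "more anomalous". The +1 terms keep the
    p-value strictly positive so that its logarithm is finite.
    """
    n = len(calib_sorted)
    n_geq = n - bisect_left(calib_sorted, score)  # calibration scores >= score
    return (n_geq + 1) / (n + 1)


def fisher_combine(p_values):
    """Fisher's method: return (statistic, combined p-value) for k p-values.

    For independent uniform p-values the statistic is chi-square distributed
    with 2k degrees of freedom. For even degrees of freedom the survival
    function has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    combined = math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))
    return statistic, combined


def combined_ood_p_value(calibration, sample_scores):
    """Combine one score per detector into a single p-value.

    calibration: one list of in-distribution scores per detector.
    sample_scores: one score per detector for a single test input.
    """
    p_values = [
        empirical_p_value(sorted(calib), s)
        for calib, s in zip(calibration, sample_scores)
    ]
    return fisher_combine(p_values)[1]


# Toy data: two detectors on very different scales.
calibration = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
print(combined_ood_p_value(calibration, [0.0, 5.0]))   # unremarkable on both detectors
print(combined_ood_p_value(calibration, [5.0, 50.0]))  # extreme on both detectors
```

A small combined p-value flags the input as likely OOD. Because the empirical quantile map places every detector on the same [0, 1] scale, scores with very different ranges can be combined directly. Scores computed on the same input are generally correlated, so the chi-square reference distribution is only approximate in this sketch.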