Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by the growing number of AI-based systems. Existing textual OOD detectors often rely on an anomaly score (e.g., Mahalanobis distance) computed on the embedding output of the last layer of the encoder. In this work, we observe that OOD detection performance varies greatly depending on the task and on which layer's output is used. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection and that far better results could be achieved if the best layer were picked. To leverage this observation, we propose a data-driven, unsupervised method to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a greater number of classes (up to 77), which reflects more realistic settings. On this augmented benchmark, we show that the proposed post-aggregation methods achieve robust and consistent results while removing manual feature selection altogether. Their performance comes close to that of an oracle that always selects the best layer.
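To make the idea of combining layer-wise anomaly scores concrete, the following is a minimal sketch, not the aggregation method proposed in this work. It assumes a class-conditional Mahalanobis score with a shared covariance per layer, and it uses a plain average of standardised per-layer scores as a stand-in aggregator. The function names (`fit_layer`, `mahalanobis_score`, `aggregate_layer_scores`) and the input layout (one embedding matrix per layer) are hypothetical.

```python
import numpy as np


def fit_layer(train_emb, train_labels):
    """Estimate class means and a shared precision matrix for one layer."""
    classes = np.unique(train_labels)
    means = np.stack([train_emb[train_labels == c].mean(axis=0) for c in classes])
    # Centre every training point on the mean of its own class.
    centered = train_emb - means[np.searchsorted(classes, train_labels)]
    cov = centered.T @ centered / len(train_emb)
    # Small ridge term (an assumption) keeps the inversion numerically stable.
    precision = np.linalg.pinv(cov + 1e-6 * np.eye(cov.shape[0]))
    return means, precision


def mahalanobis_score(emb, means, precision):
    """Squared distance to the closest class mean; larger means more anomalous."""
    diff = emb[:, None, :] - means[None, :, :]  # shape (n, classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return d2.min(axis=1)


def aggregate_layer_scores(train_layers, train_labels, test_layers):
    """Average per-layer scores after standardising with in-distribution statistics.

    The plain mean is an illustrative stand-in aggregator (assumption).
    """
    z_scores = []
    for train_emb, test_emb in zip(train_layers, test_layers):
        means, precision = fit_layer(train_emb, train_labels)
        ref = mahalanobis_score(train_emb, means, precision)
        test = mahalanobis_score(test_emb, means, precision)
        # Standardise so layers with different score scales are comparable.
        z_scores.append((test - ref.mean()) / (ref.std() + 1e-12))
    return np.mean(z_scores, axis=0)
```

Here `train_layers` and `test_layers` would be lists holding one `(n_samples, dim)` array per encoder layer. Standardising each layer against in-distribution scores before averaging keeps a single layer with a large score range from dominating the combined score, and no OOD examples are needed to fit it.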