Out-of-distribution (OOD) detection is a rapidly growing field, driven by the new robustness and security requirements that accompany the increasing number of AI-based systems. Existing textual OOD detectors often rely on an anomaly score (e.g., the Mahalanobis distance) computed on the embedding output of the encoder's last layer. In this work, we observe that OOD detection performance varies greatly depending on the task and on the layer whose output is used. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection and that far better results could be achieved if the best layer were picked. To leverage this observation, we propose a data-driven, unsupervised method to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a greater number of classes (up to 77), which reflects more realistic settings. On this augmented benchmark, we show that the proposed post-aggregation methods achieve robust and consistent results while removing manual feature selection altogether. Their performance comes close to that of an oracle that picks the best layer.
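To make the idea of layer-wise anomaly scores concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes that per-layer embeddings are already available as NumPy arrays, scores each layer with a class-conditional Mahalanobis distance (shared covariance), and then combines the layers with a simple stand-in aggregation: each layer's scores are standardized against in-distribution reference scores and averaged. The function names (`fit_layer`, `mahalanobis_score`, `aggregate`, `demo`), the synthetic data, and the averaging rule are all illustrative assumptions.

```python
import numpy as np


def fit_layer(feats, labels):
    """Fit class means and a shared precision matrix for one layer's embeddings."""
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    # Center every sample on its own class mean, then pool the covariance.
    centered = feats - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(feats)
    cov += 1e-6 * np.eye(feats.shape[1])  # small ridge so the inverse exists
    return means, np.linalg.inv(cov)


def mahalanobis_score(x, means, precision):
    """Squared Mahalanobis distance to the nearest class mean (higher = more anomalous)."""
    diff = x[:, None, :] - means[None, :, :]  # shape (n, n_classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return d2.min(axis=1)


def aggregate(layer_scores, ref_scores):
    """Assumed aggregation: z-normalize each layer with in-distribution
    reference scores, then average across layers.

    layer_scores: (n_layers, n_test), ref_scores: (n_layers, n_ref).
    """
    mu = ref_scores.mean(axis=1, keepdims=True)
    sd = ref_scores.std(axis=1, keepdims=True) + 1e-12
    return ((layer_scores - mu) / sd).mean(axis=0)


def demo(seed=0, n_layers=3, dim=8):
    """Synthetic stand-in for encoder layers: two classes plus shifted OOD points."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=200)
    test_labels = rng.integers(0, 2, size=50)
    centers = np.stack([-2.0 * np.ones(dim), 2.0 * np.ones(dim)])
    train_s, id_s, ood_s = [], [], []
    for _ in range(n_layers):
        train = centers[labels] + rng.normal(size=(200, dim))
        id_x = centers[test_labels] + rng.normal(size=(50, dim))
        ood_x = 6.0 + rng.normal(size=(50, dim))  # far from both class means
        means, precision = fit_layer(train, labels)
        train_s.append(mahalanobis_score(train, means, precision))
        id_s.append(mahalanobis_score(id_x, means, precision))
        ood_s.append(mahalanobis_score(ood_x, means, precision))
    ref = np.stack(train_s)
    return aggregate(np.stack(id_s), ref), aggregate(np.stack(ood_s), ref)


if __name__ == "__main__":
    id_scores, ood_scores = demo()
    print("mean aggregated score, in-distribution:", id_scores.mean())
    print("mean aggregated score, OOD:", ood_scores.mean())
```

The aggregation step uses only in-distribution data and needs no OOD examples, which is in the spirit of an unsupervised combination; other aggregation rules could be substituted in `aggregate` without changing the per-layer scoring.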