Detecting out-of-distribution (OOD) data is a critical challenge in machine learning because models are often overconfident and unaware of their epistemological limits. We hypothesize that ``neural collapse'', a phenomenon affecting in-distribution data for models trained beyond loss convergence, also influences OOD data. To benefit from this interplay, we introduce NECO, a novel post-hoc method for OOD detection that leverages the geometric properties of ``neural collapse'' and of principal component spaces to identify OOD data. Our extensive experiments demonstrate that NECO achieves state-of-the-art results on both small- and large-scale OOD detection tasks while generalizing well across different network architectures. Furthermore, we provide a theoretical explanation for the effectiveness of our method in OOD detection. We plan to release the code after the anonymity period.