We propose a simple modification to standard deep learning architectures, applied during training: L2 normalization over the feature space. It produces results competitive with state-of-the-art Out-of-Distribution (OoD) detection methods while requiring relatively little training time. When L2 normalization is removed at test time, the magnitudes of the feature vectors become a surprisingly good measure for OoD detection. Intuitively, In-Distribution (ID) images yield feature vectors with large magnitudes, while OoD images yield small magnitudes, which permits a simple thresholding scheme for screening out OoD images. We provide a theoretical analysis of how this simple change works. Competitive results are possible in only 60 epochs of training on a standard ResNet18.
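The train/test asymmetry described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy feature vectors, and the threshold value are all hypothetical, and in practice the features would come from the penultimate layer of a network such as ResNet18, with the threshold chosen on held-out data.

```python
import math

def l2_norm(features):
    """Euclidean magnitude of a single feature vector."""
    return math.sqrt(sum(x * x for x in features))

def l2_normalize(features, eps=1e-12):
    """Scale a feature vector to unit length (eps guards against division by zero)."""
    norm = l2_norm(features)
    return [x / max(norm, eps) for x in features]

def forward_features(features, training):
    """Apply L2 normalization during training only; return raw features at test time."""
    return l2_normalize(features) if training else list(features)

def is_ood(features, threshold):
    """Test-time rule: flag an input as OoD when its raw feature norm falls below the threshold."""
    return l2_norm(features) < threshold

# Toy, made-up feature vectors standing in for network outputs (assumption).
id_features = [3.0, 4.0, 12.0]   # large magnitude, as expected for ID inputs
ood_features = [0.3, 0.4, 0.0]   # small magnitude, as expected for OoD inputs
THRESHOLD = 5.0                  # hypothetical value; would be tuned on validation data

if __name__ == "__main__":
    print("ID flagged as OoD: ", is_ood(forward_features(id_features, training=False), THRESHOLD))
    print("OoD flagged as OoD:", is_ood(forward_features(ood_features, training=False), THRESHOLD))
```

During training, every feature vector passed to the classifier has unit length, so magnitude carries no information for the loss; at test time the unnormalized magnitude is read directly as the detection score.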