The key to out-of-distribution detection is density estimation of the in-distribution data or of its feature representations. This is particularly challenging for dense anomaly detection in domains where the in-distribution data has a complex underlying structure. Nearest-neighbor approaches have been shown to work well in object-centric data domains, such as industrial inspection and image classification. In this paper, we show that nearest-neighbor approaches also yield state-of-the-art results on dense novelty detection in complex driving scenes when working with an appropriate feature representation. In particular, we find that transformer-based architectures produce representations that yield much better similarity metrics for the task. We identify the multi-head structure of these models as one of the reasons and demonstrate a way to transfer some of the improvements to CNNs. Ultimately, the approach is simple and non-invasive: it does not affect the primary segmentation performance and refrains from training on examples of anomalies, while achieving state-of-the-art results on RoadAnomaly, StreetHazards, and SegmentMeIfYouCan-Anomaly.
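To make the nearest-neighbor idea concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that per-pixel features have already been extracted and flattened into arrays of shape `(n, d)`, and it scores each test feature by its mean Euclidean distance to its `k` nearest in-distribution features. The function names, the choice of Euclidean distance, and the value of `k` are all illustrative assumptions. The second function, `per_head_knn_scores`, is likewise hypothetical: the abstract does not state how the multi-head structure is exploited, so splitting the channel dimension into equal groups and averaging per-group scores is only one plausible reading.

```python
import numpy as np


def knn_anomaly_scores(train_feats, test_feats, k=3):
    """Mean Euclidean distance from each test feature to its k nearest
    in-distribution (train) features. Higher means more anomalous.
    Brute-force search; illustrative only."""
    # Pairwise squared distances, shape (n_test, n_train).
    d2 = (
        (test_feats ** 2).sum(axis=1, keepdims=True)
        - 2.0 * test_feats @ train_feats.T
        + (train_feats ** 2).sum(axis=1)
    )
    # Clip tiny negative values caused by floating-point rounding.
    dists = np.sqrt(np.maximum(d2, 0.0))
    # After partitioning at index k - 1, the first k columns hold the
    # k smallest distances of each row (in arbitrary order).
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)


def per_head_knn_scores(train_feats, test_feats, num_heads, k=3):
    """ASSUMPTION (not from the paper): split the channel dimension into
    num_heads equal groups, score each group separately, then average."""
    train_heads = np.split(train_feats, num_heads, axis=1)
    test_heads = np.split(test_feats, num_heads, axis=1)
    scores = [
        knn_anomaly_scores(tr, te, k)
        for tr, te in zip(train_heads, test_heads)
    ]
    return np.mean(scores, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for in-distribution and test features.
    train = rng.normal(size=(500, 8))
    inliers = rng.normal(size=(20, 8))
    outliers = rng.normal(loc=6.0, size=(20, 8))
    test = np.concatenate([inliers, outliers])

    plain = knn_anomaly_scores(train, test, k=3)
    grouped = per_head_knn_scores(train, test, num_heads=4, k=3)
    print("mean inlier score :", plain[:20].mean())
    print("mean outlier score:", plain[20:].mean())
    print("grouped scores shape:", grouped.shape)
```

In a dense setting, the resulting score vector would be reshaped back to the spatial layout of the feature map to obtain a per-pixel anomaly map. Because the scoring only reads features, it leaves the segmentation model itself untouched, which matches the non-invasive property described above.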