Detecting anomalies in images is an important task, especially in real-time computer vision applications. In this work, we focus on computational efficiency and propose a lightweight feature extractor that processes an image in less than a millisecond on a modern GPU. We then use a student-teacher approach to detect anomalous features. We train a student network to predict the extracted features of normal, i.e., anomaly-free, training images. At test time, anomalies are detected by the student's failure to predict their features. We propose a training loss that hinders the student from imitating the teacher feature extractor beyond the normal images. This loss allows us to drastically reduce the computational cost of the student-teacher model, while improving the detection of anomalous features. We furthermore address the detection of challenging logical anomalies that involve invalid combinations of normal local features, for example, a wrong ordering of objects. We detect these anomalies by efficiently incorporating an autoencoder that analyzes images globally. We evaluate our method, called EfficientAD, on 32 datasets from three industrial anomaly detection dataset collections. EfficientAD sets new standards for both the detection and the localization of anomalies. At a latency of two milliseconds and a throughput of six hundred images per second, it enables fast handling of anomalies. Together with its low error rate, this makes it an economical solution for real-world applications and a fruitful basis for future research.
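The student-teacher idea described above can be illustrated with a minimal sketch. It assumes that the per-position anomaly score is the squared difference between teacher and student features, averaged over channels, and that the image-level score is the maximum of that map. Both choices, the function names `anomaly_map` and `image_score`, and the toy feature values are assumptions made for illustration; they are not taken from the text above, and the sketch omits the training loss and the autoencoder branch.

```python
# Sketch of student-teacher anomaly scoring on toy feature maps.
# All names and values here are hypothetical and for illustration only.

def anomaly_map(teacher_feats, student_feats):
    """Return a [height][width] map of channel-averaged squared differences.

    Both inputs are nested lists shaped [channels][height][width].
    """
    channels = len(teacher_feats)
    height = len(teacher_feats[0])
    width = len(teacher_feats[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            squared_error = sum(
                (teacher_feats[c][y][x] - student_feats[c][y][x]) ** 2
                for c in range(channels)
            )
            row.append(squared_error / channels)
        result.append(row)
    return result


def image_score(amap):
    """Assumed image-level score: the largest per-position anomaly value."""
    return max(max(row) for row in amap)


# Toy teacher features: 2 channels on a 2x2 grid.
teacher = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
# The student matches the teacher everywhere except channel 0 at (1, 1),
# mimicking a feature it never learned to predict from normal images.
student = [
    [[1.0, 2.0], [3.0, 2.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]

amap = anomaly_map(teacher, student)
score = image_score(amap)
```

In this toy case the map is zero wherever the student reproduces the teacher and nonzero only at the mismatched position, which is the localization signal the abstract refers to.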