Deep autoencoders provide an effective tool for learning non-linear dimensionality reduction in an unsupervised way. Recently, they have been used for the task of anomaly detection in the visual domain. The common belief is that a network trained to minimize the reconstruction error on anomaly-free examples should fail to accurately reconstruct anomalous regions in the application phase. This goal is typically addressed by controlling the capacity of the network, either by reducing the size of the bottleneck layer or by enforcing sparsity constraints on the activations. However, neither of these techniques explicitly penalizes the reconstruction of anomalous signals, which often results in poor detection. We tackle this problem by adapting a self-supervised learning regime that allows the use of discriminative information during training but focuses on the data manifold of normal examples. We emphasize that our approach is very efficient during both training and prediction, requiring a single forward pass for each input image. Our experiments on the MVTec AD dataset demonstrate high detection and localization performance. On the texture subset, in particular, our approach consistently outperforms recent anomaly detection methods by a significant margin.
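The reconstruction-based scoring that the abstract builds on can be illustrated with a minimal sketch. It is not the authors' method or training regime: the function names `anomaly_map` and `image_score`, the threshold of 0.05, and the toy "reconstruction" standing in for the output of a trained autoencoder are all assumptions made for illustration. The sketch only shows how a per-pixel reconstruction error yields a localization map and an image-level detection score.

```python
import numpy as np


def anomaly_map(image, reconstruction):
    """Per-pixel squared reconstruction error, averaged over the channel axis."""
    image = np.asarray(image, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if image.shape != reconstruction.shape:
        raise ValueError("image and reconstruction must have the same shape")
    return ((image - reconstruction) ** 2).mean(axis=-1)


def image_score(amap):
    """Image-level anomaly score: the largest per-pixel error in the map."""
    return float(amap.max())


# Toy data (hypothetical): an 8x8 RGB "texture" with a small bright defect.
rng = np.random.default_rng(0)
normal = rng.uniform(0.4, 0.6, size=(8, 8, 3))
defective = normal.copy()
defective[2:4, 5:7, :] = 1.0

# Stand-in for a trained autoencoder's output (assumption): the network
# reproduces the normal texture and fails to reproduce the defect.
reconstruction = normal

amap = anomaly_map(defective, reconstruction)
mask = amap > 0.05  # illustrative threshold, not taken from the paper
```

Here `mask` marks the pixels whose reconstruction error exceeds the threshold (localization), and `image_score(amap)` gives a single value per image (detection). In practice the threshold would be chosen on held-out data, and the reconstruction would come from one forward pass of the trained network.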