Deep autoencoders provide an effective tool for learning non-linear dimensionality reduction in an unsupervised way. Recently, they have been used for anomaly detection in the visual domain. When the reconstruction error is optimized on anomaly-free examples only, the common belief is that the resulting network will fail to accurately reconstruct anomalous regions in the application phase. This goal is typically addressed by controlling the capacity of the network, either by reducing the size of the bottleneck layer or by enforcing sparsity constraints on the activations. However, neither of these techniques explicitly penalizes the reconstruction of anomalous signals, which often results in poor detection. We tackle this problem by adapting a self-supervised learning regime that allows the use of discriminative information during training but focuses on the data manifold of normal examples. We emphasize that our approach is very efficient during both training and prediction, requiring a single forward pass for each input image. Our experiments on the MVTec AD dataset demonstrate high detection and localization performance. On the texture subset, in particular, our approach consistently outperforms recent anomaly detection methods by a significant margin.
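The generic reconstruction-error baseline that the abstract starts from can be illustrated with a minimal sketch. This is not the method proposed here: it substitutes a rank-k linear autoencoder (PCA) for a deep network, uses synthetic 64-dimensional signals in place of images, and does not implement the self-supervised regime, whose details the abstract does not give. The names `fit_linear_autoencoder`, `reconstruct`, and `anomaly_map` are hypothetical, and the bottleneck size `k` stands in for the capacity control mentioned above.

```python
import numpy as np


def fit_linear_autoencoder(normal, k):
    """Fit a rank-k linear autoencoder (PCA) on anomaly-free samples.

    normal: array of shape (n_samples, n_features).
    Returns the data mean and a (k, n_features) orthonormal basis.
    """
    mean = normal.mean(axis=0)
    # Right singular vectors of the centered data span the principal subspace.
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    return mean, vt[:k]


def reconstruct(x, mean, basis):
    """Encode into the k-dimensional bottleneck, then decode back."""
    code = (x - mean) @ basis.T
    return mean + code @ basis


def anomaly_map(x, mean, basis):
    """Per-feature squared reconstruction error, computed in a single pass."""
    return (x - reconstruct(x, mean, basis)) ** 2


rng = np.random.default_rng(0)

# Assumed toy data: "normal" signals lie on a 4-dimensional linear subspace.
mixing = rng.normal(size=(4, 64))
normal = rng.normal(size=(200, 4)) @ mixing

mean, basis = fit_linear_autoencoder(normal, k=4)

clean = rng.normal(size=4) @ mixing   # unseen anomaly-free sample
defective = clean.copy()
defective[10:14] += 5.0               # synthetic local defect

clean_score = anomaly_map(clean, mean, basis).sum()
defect_map = anomaly_map(defective, mean, basis)

print("clean score:", clean_score)
print("defect score:", defect_map.sum())
```

In this toy setting the clean sample is reconstructed almost exactly, while the defective one leaves a large residual. Nothing in the objective forces that outcome for a higher-capacity model, which is the weakness of capacity control that the abstract points out.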