Recent advances in multimodal foundation models have set new standards in few-shot anomaly detection. This paper explores whether high-quality visual features alone are sufficient to rival existing state-of-the-art vision-language models. We affirm this by adapting DINOv2 for one-shot and few-shot anomaly detection, with a focus on industrial applications. We show that this approach not only rivals existing techniques but can even outmatch them in many settings. Our proposed vision-only approach, AnomalyDINO, is based on patch similarities and enables both image-level anomaly prediction and pixel-level anomaly segmentation. The approach is methodologically simple and training-free and, thus, does not require any additional data for fine-tuning or meta-learning. Despite its simplicity, AnomalyDINO achieves state-of-the-art results in one- and few-shot anomaly detection (e.g., pushing the one-shot performance on MVTec-AD from an AUROC of 93.1% to 96.6%). The reduced overhead, coupled with its outstanding few-shot performance, makes AnomalyDINO a strong candidate for fast deployment, for example, in industrial contexts.
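The patch-similarity idea behind such a training-free detector can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes patch features (e.g., from DINOv2) are already extracted as row vectors, and it scores each test patch by its cosine distance to the nearest nominal reference patch, aggregating the top patch scores into an image-level score. The function names and the top-k aggregation choice are illustrative assumptions.

```python
import numpy as np

def patch_anomaly_scores(ref_patches, test_patches):
    """Per-patch anomaly scores via nearest-neighbor cosine distance.

    ref_patches:  (N_ref, D) patch features from nominal reference images
    test_patches: (N_test, D) patch features from the test image
    """
    # L2-normalize so dot products equal cosine similarities
    ref = ref_patches / np.linalg.norm(ref_patches, axis=1, keepdims=True)
    test = test_patches / np.linalg.norm(test_patches, axis=1, keepdims=True)
    sim = test @ ref.T  # (N_test, N_ref) cosine similarities
    # A patch is anomalous if even its closest nominal patch is dissimilar
    return 1.0 - sim.max(axis=1)

def image_score(patch_scores, top_k=10):
    """Aggregate patch scores into one image-level score (mean of top-k)."""
    k = min(top_k, len(patch_scores))
    return float(np.sort(patch_scores)[-k:].mean())

# Example with random stand-in features (a real pipeline would use DINOv2)
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 32))   # memory bank from few-shot references
test_img = rng.normal(size=(50, 32))     # patches of a test image
scores = patch_anomaly_scores(reference, test_img)
print(image_score(scores))
```

Because the per-patch scores live on the patch grid, the same quantities can be reshaped and upsampled to obtain a pixel-level anomaly map, which is what enables segmentation without any training.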