Anomaly detection is the task of identifying abnormal samples in large unlabeled datasets. While progress in deep learning and the advent of foundation models have produced powerful zero-shot anomaly detection methods, their deployment in practice is often hindered by the lack of labeled data: without it, their detection performance cannot be evaluated reliably. In this work, we propose SWSA (Selection With Synthetic Anomalies), a general-purpose framework for selecting image-based anomaly detectors using a generated synthetic validation set. Our proposed anomaly generation method assumes access to only a small support set of normal images and requires no training or fine-tuning. Once generated, the synthetic validation set is used to create detection tasks that together form a validation framework for model selection. In an empirical study, we find that SWSA often selects models that match the selections made with a ground-truth validation set, resulting in higher AUROCs than baseline methods. We also find that SWSA selects prompts for CLIP-based anomaly detection that outperform baseline prompt selection strategies on all datasets, including the challenging MVTec-AD and VisA datasets.
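The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each candidate detector has already produced anomaly scores for held-out normal images and for synthetic anomalies, and it simply picks the candidate with the highest AUROC on that synthetic validation set. The names `auroc`, `select_detector`, `detector_a`, and `detector_b`, as well as all score values, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Tuple


def auroc(normal_scores: Sequence[float], anomaly_scores: Sequence[float]) -> float:
    """AUROC as the probability that a random anomaly scores higher than a
    random normal sample (ties count as 0.5)."""
    if not normal_scores or not anomaly_scores:
        raise ValueError("need at least one score per class")
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(normal_scores) * len(anomaly_scores))


def select_detector(
    scores_by_model: Dict[str, Dict[str, Sequence[float]]],
) -> Tuple[str, Dict[str, float]]:
    """Return the candidate with the highest AUROC on the synthetic validation set."""
    aurocs = {
        name: auroc(scores["normal"], scores["synthetic"])
        for name, scores in scores_by_model.items()
    }
    best = max(aurocs, key=aurocs.get)
    return best, aurocs


# Hypothetical anomaly scores (higher = more anomalous) from two candidate detectors.
candidates = {
    "detector_a": {"normal": [0.1, 0.2, 0.3], "synthetic": [0.4, 0.6, 0.25]},
    "detector_b": {"normal": [0.1, 0.5, 0.3], "synthetic": [0.2, 0.4, 0.35]},
}

best, aurocs = select_detector(candidates)
print(best, aurocs)
```

In this toy setup the candidate whose scores best separate normal images from synthetic anomalies is selected. The same loop could in principle rank CLIP prompts instead of detectors, under the assumption that each prompt yields its own set of anomaly scores.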