In recent years, performance on established anomaly detection benchmarks such as MVTec AD and VisA has begun to saturate in terms of segmentation AU-PRO, with state-of-the-art models often separated by less than one percentage point. This lack of discriminative power prevents a meaningful comparison of models and thus hinders progress in the field, especially given the inherently stochastic nature of machine learning results. We present MVTec AD 2, a collection of eight anomaly detection scenarios with more than 8000 high-resolution images. It comprises challenging and highly relevant industrial inspection use cases that previous datasets have not considered, including transparent and overlapping objects, dark-field and backlight illumination, objects with high variance in the normal data, and extremely small defects. We provide comprehensive evaluations of state-of-the-art methods and show that their performance remains below 60% average AU-PRO. Additionally, our dataset provides test scenarios with changed lighting conditions to assess the robustness of methods under real-world distribution shifts. We host a publicly accessible evaluation server that holds the pixel-precise ground truth of the test set (https://benchmark.mvtec.com/). All image data is available at https://www.mvtec.com/company/research/datasets/mvtec-ad-2.
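For readers unfamiliar with the segmentation AU-PRO metric referred to above, the following is a minimal sketch of how such a score can be computed for a single image. It is not the evaluation code behind the benchmark server. The function names (`gt_regions`, `pro_and_fpr`, `au_pro`), the use of 8-connectivity for ground-truth regions, the `>=` thresholding rule, and the default integration limit `fpr_limit=0.3` are all assumptions made for illustration. The sketch also assumes that the image contains at least one defect region and at least one defect-free pixel.

```python
from collections import deque


def gt_regions(gt):
    """Return the 8-connected defect regions of a binary mask as lists of (row, col)."""
    h, w = len(gt), len(gt[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if gt[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, region = deque([(r, c)]), []
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    region.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and gt[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                regions.append(region)
    return regions


def pro_and_fpr(scores, gt, threshold):
    """Per-region overlap (PRO) and false positive rate at one threshold."""
    regions = gt_regions(gt)
    # PRO: mean over regions of the fraction of region pixels flagged anomalous.
    pro = sum(
        sum(scores[y][x] >= threshold for y, x in region) / len(region)
        for region in regions
    ) / len(regions)
    # FPR: fraction of defect-free pixels flagged anomalous.
    normal = [(y, x) for y, row in enumerate(gt) for x, v in enumerate(row) if not v]
    fpr = sum(scores[y][x] >= threshold for y, x in normal) / len(normal)
    return pro, fpr


def au_pro(scores, gt, fpr_limit=0.3):
    """Area under the PRO-vs-FPR curve up to fpr_limit, normalized to [0, 1]."""
    thresholds = sorted({s for row in scores for s in row}, reverse=True)
    curve = [(0.0, 0.0)]  # (fpr, pro) when no pixel is flagged
    for t in thresholds:
        pro, fpr = pro_and_fpr(scores, gt, t)
        curve.append((fpr, pro))
    area = 0.0
    for (f0, p0), (f1, p1) in zip(curve, curve[1:]):
        if f0 >= fpr_limit:
            break
        if f1 > fpr_limit:  # cut the last segment at the limit by linear interpolation
            p1 = p0 + (p1 - p0) * (fpr_limit - f0) / (f1 - f0)
            f1 = fpr_limit
        area += (f1 - f0) * (p0 + p1) / 2  # trapezoidal rule
    return area / fpr_limit


# Toy example with two defect regions (values are made up for illustration).
gt = [[1, 1, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 1]]
scores = [[0.9, 0.2, 0.1, 0.1],
          [0.1, 0.1, 0.1, 0.1],
          [0.1, 0.8, 0.1, 0.7]]
print(pro_and_fpr(scores, gt, 0.5))
print(au_pro(scores, gt))
```

Because every ground-truth region contributes equally to PRO regardless of its size, a method cannot reach a high score by segmenting only large defects, which is relevant for a dataset that emphasizes extremely small defects. In practice the curve would be computed over all test images of a scenario rather than a single image.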