This paper studies zero-shot anomaly classification (AC) and segmentation (AS) in industrial vision. We show that unlabeled test images carry abundant implicit normal and abnormal cues that can be exploited for anomaly determination, a resource that prior methods ignore. Our key observation is that, for industrial product images, a normal image patch can find a relatively large number of similar patches in other unlabeled images, whereas an abnormal patch finds only a few. We leverage this discriminative characteristic to design a novel zero-shot AC/AS method based on Mutual Scoring (MuSc) of the unlabeled images, which requires neither training nor prompts. Specifically, we perform Local Neighborhood Aggregation with Multiple Degrees (LNAMD) to obtain patch features capable of representing anomalies of varying sizes. We then propose the Mutual Scoring Mechanism (MSM), in which the unlabeled test images assign anomaly scores to each other. Furthermore, we present an optimization approach named Re-scoring with Constrained Image-level Neighborhood (RsCIN) for image-level anomaly classification, which suppresses false positives caused by noise in normal images. The superior performance on the challenging MVTec AD and VisA datasets demonstrates the effectiveness of our approach. Compared with state-of-the-art zero-shot approaches, MuSc achieves an absolute PRO gain of $\textbf{21.1\%}$ (from 72.7% to 93.8%) on MVTec AD, and gains of $\textbf{19.4\%}$ in pixel-AP and $\textbf{14.7\%}$ in pixel-AUROC on VisA. In addition, our zero-shot approach outperforms most few-shot approaches and is comparable to some one-class methods. Code is available at https://github.com/xrli-U/MuSc.
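The key observation above (normal patches have many close matches in other unlabeled images, abnormal patches have few) can be illustrated with a minimal NumPy sketch. This is not the released MuSc implementation: the function name `mutual_scores`, the `keep_fraction` parameter, the use of Euclidean distance, and the choice to average the smallest per-image nearest-neighbor distances are all assumptions made for illustration. The patch features are assumed to be precomputed, and LNAMD and RsCIN are not modeled.

```python
import numpy as np


def mutual_scores(features, keep_fraction=0.3):
    """Score every patch by how poorly it is matched in the other images.

    features: array of shape (N, P, D) holding D-dimensional features
              for P patches in each of N unlabeled images (N >= 2).
    keep_fraction: hypothetical parameter; the fraction of other images
              (those with the closest match) that contribute to the score.
    Returns an (N, P) array; higher means more anomalous.
    """
    n_images, n_patches, _ = features.shape
    if n_images < 2:
        raise ValueError("mutual scoring needs at least two images")
    k = max(1, int(round(keep_fraction * (n_images - 1))))
    scores = np.zeros((n_images, n_patches))
    for i in range(n_images):
        nearest = []
        for j in range(n_images):
            if j == i:
                continue
            # Pairwise Euclidean distances between patches of image i and j.
            diff = features[i][:, None, :] - features[j][None, :, :]
            dist = np.linalg.norm(diff, axis=-1)      # shape (P, P)
            # Distance from each patch of image i to its best match in image j.
            nearest.append(dist.min(axis=1))
        nearest = np.sort(np.stack(nearest, axis=1), axis=1)  # shape (P, N-1)
        # Average over the k images that match best: a patch with even a
        # few good matches elsewhere gets a low score.
        scores[i] = nearest[:, :k].mean(axis=1)
    return scores


# Toy data: 6 images whose 4 patches are noisy copies of shared prototypes,
# with one patch of image 0 pushed far away to play the role of a defect.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(4, 8))
toy = prototypes[None] + 0.01 * rng.normal(size=(6, 4, 8))
toy[0, 2] += 10.0
toy_scores = mutual_scores(toy)
image_level = toy_scores.max(axis=1)  # one simple image-level score
```

In this toy setup the displaced patch has no close match in any other image, so it should receive the highest patch score, and image 0 the highest image-level score. Taking the maximum patch score per image is only one simple way to derive an image-level score and does not represent the re-scoring step described in the abstract.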