Visual anomaly detection aims to learn normality from normal images, but existing approaches are fragmented across tasks: defect detection, semantic anomaly detection, multi-class anomaly detection, and anomaly clustering. This one-task-one-model paradigm is resource-intensive, and its maintenance cost grows with the number of tasks. We present UniFormaly, a universal and powerful anomaly detection framework. We motivate our off-the-shelf approach by identifying a suboptimality issue in online encoder-based methods. To achieve unified anomaly detection, we introduce Back Patch Masking (BPM) and top k-ratio feature matching. BPM removes irrelevant background regions using self-attention maps from self-supervised ViTs. It operates in a task-agnostic manner and reduces memory consumption, which lets it scale to tasks with large datasets. Top k-ratio feature matching unifies anomaly levels and tasks by casting anomaly scoring as multiple instance learning. UniFormaly achieves strong results across a range of tasks and datasets. Code is available at https://github.com/YoojLee/Uniformaly.
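The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean-attention threshold used for masking, the Euclidean nearest-neighbor patch score, and the toy values are all assumptions made for illustration. The sketch treats an image as a bag of patch features, drops patches with low attention, scores the remaining patches against a memory of normal patch features, and averages the top k-ratio of those scores to get an image-level score.

```python
import math

def back_patch_mask(attention):
    """Keep patches whose attention exceeds the mean (assumed thresholding rule)."""
    threshold = sum(attention) / len(attention)
    return [a > threshold for a in attention]

def patch_scores(patches, memory_bank):
    """Score each patch by its distance to the nearest normal patch feature."""
    return [min(math.dist(p, m) for m in memory_bank) for p in patches]

def top_k_ratio_score(scores, k_ratio):
    """Average the highest k_ratio fraction of patch scores (at least one patch)."""
    k = max(1, math.ceil(k_ratio * len(scores)))
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / k

# Toy data (hypothetical): 4 patches with 2-D features and one attention value each.
attention = [0.9, 0.8, 0.1, 0.05]
test_patches = [(0.0, 0.0), (1.0, 3.0), (5.0, 5.0), (9.0, 9.0)]
memory_bank = [(0.0, 0.0), (1.0, 0.0)]  # features of normal patches

mask = back_patch_mask(attention)
foreground = [p for p, keep in zip(test_patches, mask) if keep]
scores = patch_scores(foreground, memory_bank)
image_score = top_k_ratio_score(scores, k_ratio=0.5)
```

In this sketch a small `k_ratio` behaves like max pooling over patch scores, which suits small localized defects, while `k_ratio=1.0` averages over all retained patches, which is closer to an image-level semantic score. This is one way to read the abstract's claim that the ratio unifies anomaly levels.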