Automatic anomaly detection based on visual cues is of practical significance in domains such as manufacturing and product quality assessment. This paper introduces a new conditional anomaly detection problem: identifying anomalies in a query image by comparing it to a reference shape. To support this task, we create a large dataset, BrokenChairs-180K, consisting of around 180K images with diverse anomalies, geometries, and textures, paired with 8,143 reference 3D shapes. We further propose a novel transformer-based approach that explicitly learns the correspondence between the query image and the reference 3D shape via feature alignment and leverages a customized attention mechanism for anomaly detection. We evaluate the approach through comprehensive experiments, and the results serve as a benchmark for future research in this domain.