Object anomaly detection is an important problem in machine vision and has seen remarkable progress recently. However, two significant challenges hinder its research and application. First, existing datasets lack comprehensive visual information from varied pose angles. They typically rest on the unrealistic assumption that the anomaly-free training data is pose-aligned and that test samples share the same pose as the training data. In practice, anomalies may appear in any region of an object, and training and query samples may have different poses, which calls for the study of pose-agnostic anomaly detection. Second, the lack of consensus on experimental protocols for pose-agnostic anomaly detection leads to unfair comparisons between methods and hinders research on the problem. To address these issues, we develop the Multi-pose Anomaly Detection (MAD) dataset and the Pose-agnostic Anomaly Detection (PAD) benchmark, which take a first step toward addressing the pose-agnostic anomaly detection problem. Specifically, we build MAD using 20 complex-shaped LEGO toys, including 4K views with various poses and high-quality, diverse 3D anomalies in both simulated and real environments. Additionally, we propose a novel method, OmniposeAD, trained using MAD and specifically designed for pose-agnostic anomaly detection. Through comprehensive evaluations, we demonstrate the relevance of our dataset and method. Furthermore, we provide an open-source benchmark library, including the dataset and baseline methods covering 8 anomaly detection paradigms, to facilitate future research and application in this domain. Code, data, and models are publicly available at https://github.com/EricLee0224/PAD.