2D-based industrial anomaly detection has been widely studied; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many unexplored areas. Existing multimodal industrial anomaly detection methods directly concatenate the multimodal features, which causes strong interference between the features and harms detection performance. In this paper, we propose Multi-3D-Memory (M3DM), a novel multimodal anomaly detection method with a hybrid fusion scheme. First, we design an unsupervised feature fusion with patch-wise contrastive learning to encourage interaction between the features of different modalities. Second, we use a decision-layer fusion with multiple memory banks to avoid information loss, together with additional novelty classifiers that make the final decision. We further propose a point feature alignment operation to better align the point cloud and RGB features. Extensive experiments show that our multimodal industrial anomaly detection model outperforms state-of-the-art (SOTA) methods in both detection and segmentation precision on the MVTec-3D AD dataset. Code is available at https://github.com/nomewang/M3DM.
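To make the two fusion stages concrete, the following is a minimal sketch and not the released M3DM implementation. It assumes an InfoNCE-style form for the patch-wise contrastive objective and Euclidean nearest-neighbour lookup for memory-bank scoring. The function names, the `temperature` parameter, and the use of NumPy are illustrative assumptions, and the final novelty classifier is omitted.

```python
import numpy as np


def patchwise_contrastive_loss(rgb_feats, pt_feats, temperature=0.1):
    """InfoNCE-style loss (assumed form): RGB patch i should match point patch i.

    rgb_feats, pt_feats: arrays of shape (num_patches, dim), where row i of
    both arrays describes the same spatial patch.
    """
    # L2-normalise so the dot product is a cosine similarity.
    rgb = rgb_feats / np.linalg.norm(rgb_feats, axis=1, keepdims=True)
    pt = pt_feats / np.linalg.norm(pt_feats, axis=1, keepdims=True)

    # Similarity of every RGB patch to every point-cloud patch.
    logits = rgb @ pt.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


def memory_bank_score(query_feats, memory_bank):
    """Per-patch anomaly score: distance to the nearest memory-bank entry.

    query_feats: (num_patches, dim); memory_bank: (bank_size, dim).
    """
    diffs = query_feats[:, None, :] - memory_bank[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1)
```

In this sketch one memory bank would be kept per feature type (for example RGB, point cloud, and fused features), each would be scored separately with `memory_bank_score`, and the per-bank scores would then be passed to a separate decision step rather than being concatenated at the feature level.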