3D anomaly detection is an emerging and vital computer vision task in industrial manufacturing (IM). Many advanced algorithms have been published recently, but most of them cannot meet the needs of IM. They have several disadvantages: i) they are difficult to deploy on production lines because they rely heavily on large pre-trained models; ii) they greatly increase storage overhead through heavy use of memory banks; iii) their inference cannot run in real time. To overcome these issues, we propose an easy and deployment-friendly network (called EasyNet) that uses neither pre-trained models nor memory banks. First, we design a multi-scale multi-modality feature encoder-decoder to accurately reconstruct the segmentation maps of anomalous regions and to encourage interaction between RGB images and depth images. Second, we adopt a multi-modality anomaly segmentation network to produce a precise anomaly map. Third, we propose an attention-based information entropy fusion module for feature fusion during inference, making the network suitable for real-time deployment. Extensive experiments show that EasyNet achieves an anomaly detection AUROC of 92.6% without using pre-trained models or memory banks. In addition, EasyNet is faster than existing methods, running at 94.55 FPS on a Tesla V100 GPU.
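For readers unfamiliar with the reported metric, the following is a minimal sketch of how an anomaly detection AUROC can be computed from per-sample anomaly scores and binary ground-truth labels. It is not taken from the EasyNet code: the function name `auroc` and the toy scores are illustrative assumptions, and the rank-based (Mann-Whitney) formulation shown here is just one standard way to obtain the value.

```python
def auroc(scores, labels):
    """Rank-based AUROC (hypothetical helper, not from the paper).

    Equals the probability that a randomly chosen anomalous sample
    (label 1) receives a higher score than a randomly chosen normal
    sample (label 0), with ties counted as one half.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, averaging the ranks of tied scores.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one normal and one anomalous sample")

    # Mann-Whitney U statistic of the anomalous class, normalised to [0, 1].
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: four samples, two normal (0) and two anomalous (1).
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

A value of 1.0 means every anomalous sample is scored above every normal one, while 0.5 corresponds to chance-level ranking.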