3D anomaly detection is an emerging and vital computer vision task in industrial manufacturing (IM). Many advanced algorithms have been published recently, but most of them cannot meet the needs of IM. They have several disadvantages: i) they are difficult to deploy on production lines because they rely heavily on large pre-trained models; ii) they greatly increase storage overhead through overuse of memory banks; iii) their inference cannot run in real time. To overcome these issues, we propose an easy and deployment-friendly network (called EasyNet) that uses neither pre-trained models nor memory banks. First, we design a multi-scale multi-modality feature encoder-decoder to accurately reconstruct the segmentation maps of anomalous regions and to encourage interaction between RGB images and depth images. Second, we adopt a multi-modality anomaly segmentation network to obtain a precise anomaly map. Third, we propose an attention-based information entropy fusion module for feature fusion during inference, which makes the network suitable for real-time deployment. Extensive experiments show that EasyNet achieves an anomaly detection AUROC of 92.6% without using pre-trained models or memory banks. In addition, EasyNet is faster than existing methods, reaching a frame rate of 94.55 FPS on a Tesla V100 GPU.
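For readers unfamiliar with the reported metric, the following is a minimal sketch of how an image-level anomaly detection AUROC can be computed from per-sample anomaly scores. It is not taken from EasyNet: the function name `auroc` and the toy labels and scores are hypothetical, and the rank-sum (Mann-Whitney U) formulation is simply one standard way to obtain the area under the ROC curve.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: iterable of 0/1 (1 = anomalous sample), hypothetical ground truth.
    scores: iterable of floats, higher means "more anomalous".
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one normal and one anomalous sample")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        # Tied samples share the average of their 1-based ranks,
        # so a tie between a normal and an anomalous sample counts as half a win.
        avg_rank = (i + j) / 2.0 + 1.0
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j + 1) if pairs[k][1])
        i = j + 1

    # U statistic: number of (anomalous, normal) pairs ranked correctly.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data (made up for illustration): two normal and two anomalous samples.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUROC of 1.0 means every anomalous sample scores higher than every normal one, while 0.5 corresponds to chance-level ranking.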