3D anomaly detection is an emerging and vital computer vision task in industrial manufacturing (IM). Many advanced algorithms have been published recently, but most cannot meet the needs of IM. They share three drawbacks: i) they are difficult to deploy on production lines because they rely heavily on large pre-trained models; ii) they greatly increase storage overhead through heavy use of memory banks; iii) they cannot run inference in real time. To overcome these issues, we propose an easy, deployment-friendly network (called EasyNet) that uses neither pre-trained models nor memory banks. First, we design a multi-scale multi-modality feature encoder-decoder to accurately reconstruct the segmentation maps of anomalous regions and to encourage interaction between RGB images and depth images. Second, we adopt a multi-modality anomaly segmentation network to produce a precise anomaly map. Third, we propose an attention-based information entropy fusion module for feature fusion during inference, making the network suitable for real-time deployment. Extensive experiments show that EasyNet achieves an anomaly detection AUROC of 92.6% without pre-trained models or memory banks. In addition, EasyNet is faster than existing methods, running at 94.55 FPS on a Tesla V100 GPU.
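For readers less familiar with the metric, the detection AUROC reported above can be read as the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one. The sketch below computes it directly from that pairwise definition. It is only an illustration of the metric, not EasyNet's evaluation code; the function name `auroc` and the scores and labels are hypothetical values made up for the example.

```python
def auroc(scores, labels):
    """Area under the ROC curve from its pairwise definition.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for anomalous samples, 0 for normal samples.
    Ties between an anomalous and a normal score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical image-level anomaly scores (not taken from the paper).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(f"AUROC = {auroc(scores, labels):.3f}")
```

The double loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based formulation gives the same value more efficiently on large test sets.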