Surface defect inspection is of great importance in industrial manufacturing and production. Although deep-learning-based defect inspection methods have made significant progress, they still struggle with weak defects that are hard to distinguish and with defect-like interference in the background. To address these issues, we propose CINFormer, a UNet-like transformer network with multi-stage CNN (convolutional neural network) feature injection for surface defect segmentation. CINFormer uses a simple yet effective feature integration mechanism that injects multi-level CNN features of the input image into different stages of the transformer network in the encoder. This preserves both the CNN's strength in capturing detailed features and the transformer's strength in suppressing background noise, which facilitates accurate defect detection. In addition, CINFormer introduces a Top-K self-attention module that focuses on the tokens carrying more important information about the defects, further reducing the impact of redundant background. Extensive experiments on the surface defect datasets DAGM 2007, Magnetic Tile, and NEU show that CINFormer achieves state-of-the-art performance in defect detection.
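To make the Top-K self-attention idea concrete, the following is a minimal sketch, not the CINFormer implementation. The abstract does not specify how the top-K tokens are selected, so the sketch assumes one common formulation: each query keeps only its K highest-scoring keys under scaled dot-product attention, and the softmax is taken over those keys alone. The function name `top_k_attention` and its parameters are hypothetical, and plain Python lists are used for clarity; a practical version would presumably use a tensor library and multiple heads.

```python
import math


def top_k_attention(queries, keys, values, top_k):
    """Single-head attention where each query attends only to its top_k keys.

    Assumption (not from the paper): selection is per query, by highest
    scaled dot-product score. queries and keys are lists of equal-length
    float vectors; values holds one vector per key.
    """
    if not 1 <= top_k <= len(keys):
        raise ValueError("top_k must be between 1 and the number of keys")
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Indices of the top_k highest-scoring keys (ties go to the lower index).
        kept = sorted(range(len(scores)), key=lambda i: -scores[i])[:top_k]
        # Softmax restricted to the kept keys; all other keys get zero weight.
        peak = max(scores[i] for i in kept)
        exp_scores = {i: math.exp(scores[i] - peak) for i in kept}
        total = sum(exp_scores.values())
        # Weighted sum of the kept value vectors.
        out = [0.0] * len(values[0])
        for i, e in exp_scores.items():
            weight = e / total
            for d, component in enumerate(values[i]):
                out[d] += weight * component
        outputs.append(out)
    return outputs
```

With `top_k` equal to the number of keys, this reduces to ordinary softmax attention; smaller values discard low-scoring tokens entirely, which is the intuition behind reducing the influence of redundant background tokens.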