Surface defect inspection plays an important role in industrial manufacturing and production. Although convolutional neural network (CNN) based defect inspection methods have advanced considerably, they still face challenges such as defect scale variation, complex backgrounds, and low contrast. To address these issues, we propose a joint attention-guided feature fusion network (JAFFNet), built on an encoder-decoder architecture, for saliency detection of surface defects. JAFFNet incorporates a joint attention-guided feature fusion (JAFF) module into the decoding stages to adaptively fuse low-level and high-level features. The JAFF module uses a learned joint channel-spatial attention map, derived from high-level semantic features, to guide the fusion; this map emphasizes defect features and suppresses background noise, which benefits the detection of low-contrast defects. In addition, JAFFNet places a dense receptive field (DRF) module after the encoder to capture features with rich context information, which helps detect defects of different scales. The DRF module consists of a sequence of multi-receptive-field (MRF) units, each taking as input the feature maps of all preceding MRF units together with the original input, so the resulting DRF features cover a wide range of receptive fields. Extensive experiments on SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves promising performance compared with other state-of-the-art methods while reaching a real-time defect detection speed of 66 FPS.
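The two mechanisms named in the abstract, attention-guided fusion and densely connected MRF units, can be illustrated with a minimal pure-Python sketch. This is not the JAFFNet implementation. The abstract does not give the exact formulas, so the specific attention form (sigmoid of pooled high-level features), the fusion rule (attention-weighted low-level features added to high-level features), and all function names (`joint_attention`, `jaff_fuse`, `dense_receptive_field`) are assumptions chosen for illustration. Feature maps are represented as lists of channels, each a flat list of spatial values, and the high-level features are assumed to be already resized to match the low-level ones.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def joint_attention(high):
    """Build a joint channel-spatial attention map from high-level features.

    Assumed form (not from the paper): channel weight = sigmoid of the
    channel's global average; spatial weight = sigmoid of the cross-channel
    mean at each position; joint map = product of the two.
    """
    n_pos = len(high[0])
    channel = [sigmoid(sum(ch) / n_pos) for ch in high]
    spatial = [sigmoid(sum(ch[i] for ch in high) / len(high))
               for i in range(n_pos)]
    return [[c * s for s in spatial] for c in channel]


def jaff_fuse(low, high):
    """Fuse low- and high-level features under the attention map.

    Assumed rule: the attention map reweights the low-level features,
    which are then added to the high-level features.
    """
    att = joint_attention(high)
    return [[a * l + h for a, l, h in zip(att_c, low_c, high_c)]
            for att_c, low_c, high_c in zip(att, low, high)]


def dense_receptive_field(x, units):
    """Run MRF units with dense connectivity.

    Each unit receives the original input channels concatenated with the
    output channels of every preceding unit, as the abstract describes.
    `units` are hypothetical callables mapping a channel list to a channel
    list; a real MRF unit would contain learned convolutions.
    """
    collected = list(x)      # starts with the original input channels
    outputs = []
    for unit in units:
        out = unit(collected)
        outputs.append(out)
        collected = collected + out   # later units also see this output
    return outputs
```

With all-zero high-level features, every attention weight in this sketch is 0.5 × 0.5 = 0.25, so the fused result is simply a quarter of the low-level features. In the dense block, a two-channel input followed by three single-channel units yields input widths of 2, 3, and 4 channels, which shows how later units aggregate progressively more context.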