Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation

The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on imbalanced multiclass datasets with limited samples. DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to improve the precision of object localization. Its core is a multiscale depth-separable convolution block that captures localized patterns across scales. This block is complemented by a spatial-channel squeeze and excitation (scSE) attention unit, which models interdependencies between channels and spatial regions in feature maps. In addition, additive attention gates refine segmentation by connecting the encoder and decoder pathways. To augment the model, engineered features (Gabor filters for textural analysis, and Sobel and Canny filters for edge detection) are injected under the guidance of semantic masks, strategically expanding the feature space. Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and on a benchmark dataset validate DAU-FI Net's capabilities. Ablation studies highlight incremental benefits from attention blocks and feature injection. DAU-FI Net achieves state-of-the-art mean Intersection over Union (IoU) of 95.6% and 98.8% on the defect test set and the benchmark, respectively, surpassing prior methods by 8.9% and 12.6%. The proposed architecture provides a robust solution, advancing semantic segmentation for multiclass problems with limited training data. Our sewer-culvert defects dataset, featuring pixel-level annotations, opens avenues for further research in this crucial domain. Overall, this work delivers key innovations in architecture, attention, and feature engineering to elevate semantic segmentation efficacy.
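
To make the scSE attention unit described above more concrete, here is a minimal NumPy sketch of a concurrent spatial and channel squeeze-and-excitation forward pass on a single feature map. It is an illustration under stated assumptions, not the authors' implementation: the function name `scse`, the weight names, the omission of bias terms, and the choice to combine the two branches by addition are all assumptions (other combinations, such as an element-wise maximum, are also used in practice).

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def scse(x, w_reduce, w_expand, w_spatial):
    """Sketch of a concurrent spatial and channel SE block (biases omitted).

    x         : feature map of shape (C, H, W)
    w_reduce  : (C // r, C) weights of the channel bottleneck (hypothetical name)
    w_expand  : (C, C // r) weights restoring the channel count (hypothetical name)
    w_spatial : (C,) weights of a 1x1 convolution to one channel (hypothetical name)
    """
    # Channel branch: squeeze spatial dimensions, then gate each channel.
    squeezed = x.mean(axis=(1, 2))                   # (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)    # ReLU, (C // r,)
    channel_gate = sigmoid(w_expand @ hidden)        # (C,)
    x_channel = x * channel_gate[:, None, None]

    # Spatial branch: squeeze channels with a 1x1 convolution, then gate each pixel.
    spatial_logits = np.tensordot(w_spatial, x, axes=([0], [0]))  # (H, W)
    spatial_gate = sigmoid(spatial_logits)
    x_spatial = x * spatial_gate[None, :, :]

    # Assumption: the two recalibrated maps are combined by addition.
    return x_channel + x_spatial


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((8, 4, 5))        # C=8, H=4, W=5
    out = scse(
        features,
        rng.standard_normal((2, 8)),                 # reduction ratio r=4 (assumed)
        rng.standard_normal((8, 2)),
        rng.standard_normal(8),
    )
    print(out.shape)
```

The output keeps the input shape, so a block like this can be placed after a convolution block without changing the rest of the network. In a trained model the three weight arrays would be learned parameters rather than random values.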
