In construction quality monitoring, accurately detecting and segmenting cracks in concrete structures is paramount for safety and maintenance. Convolutional neural networks (CNNs) have demonstrated strong performance in crack segmentation, yet they often struggle with complex backgrounds and fail to fully capture fine-grained tubular structures. Transformers, in contrast, excel at capturing global context but lack precision in detailed feature extraction. To address these challenges, we introduce DSCformer, a novel hybrid model that integrates an enhanced Dynamic Snake Convolution (DSConv) with a Transformer architecture for crack segmentation. Our key contributions are twofold: we enhance DSConv with a pyramid kernel for adaptive offset computation and a simultaneous bi-directional learnable offset iteration, significantly improving the model's ability to capture intricate crack patterns; and we propose a Weighted Convolutional Attention Module (WCAM), which refines channel attention and enables more precise, adaptive feature weighting. We evaluate DSCformer on the Crack3238 and FIND datasets, achieving IoUs of 59.22\% and 87.24\%, respectively. The experimental results suggest that DSCformer outperforms state-of-the-art methods across both datasets.
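To make the channel-attention idea behind WCAM concrete, the sketch below shows a generic squeeze-and-excitation-style gate extended with a learnable per-channel weighting. This is a hedged illustration only: the abstract does not specify WCAM's internals, so the function name, the reduction structure, and the `channel_weights` vector are assumptions, not the paper's actual design.

```python
import numpy as np

def weighted_channel_attention(x, w1, w2, channel_weights):
    """Hypothetical sketch of weighted channel attention (not the paper's WCAM).

    x               : feature map of shape (C, H, W)
    w1              : reduction matrix of shape (C // r, C)
    w2              : expansion matrix of shape (C, C // r)
    channel_weights : assumed learnable per-channel scale of shape (C,)
    """
    squeeze = x.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)      # channel reduction + ReLU
    logits = (w2 @ hidden) * channel_weights    # weighted excitation (assumed)
    gate = 1.0 / (1.0 + np.exp(-logits))        # sigmoid gate in (0, 1)
    return x * gate[:, None, None]              # rescale each channel
```

Because the sigmoid gate lies in (0, 1), each output channel is a damped copy of its input, with the damping adapted per channel; a trainable `channel_weights` would let the module emphasize crack-relevant channels beyond what plain squeeze-and-excitation offers.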