A Computer Vision Enabled damage detection model with improved YOLOv5 based on Transformer Prediction Head

Objective: Accurate, up-to-date computer vision-based damage classification and localization are critically important for monitoring civil infrastructure and maintaining its safety and serviceability. However, current state-of-the-art deep learning (DL)-based damage detection models often extract features poorly in complex and noisy environments, which limits accurate and reliable object distinction.

Method: To this end, we present DenseSPH-YOLOv5, a real-time, high-performance DL-based damage detection model in which DenseNet blocks are integrated into the backbone to better preserve and reuse critical feature information. In addition, convolutional block attention modules (CBAM) are used to strengthen the attention mechanism, yielding strong and discriminative deep spatial features and better detection in challenging environments. Finally, additional feature fusion layers and a Swin-Transformer Prediction Head (SPH) are added; they leverage self-attention to detect objects at multiple scales more efficiently while reducing computational complexity.

Results: Evaluated on the large-scale Road Damage Dataset (RDD-2018), DenseSPH-YOLOv5 achieves a mean average precision (mAP) of 85.25 %, an F1-score of 81.18 %, and a precision (P) of 89.51 % at a detection rate of 62.4 FPS, outperforming current state-of-the-art models.

Significance: This research provides an effective and efficient damage localization model that addresses the shortcomings of existing DL-based damage detection models by predicting highly accurate localized bounding boxes. The work is a step towards an accurate and robust automated damage detection system for real-time in-field applications.
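The reported figures can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the model. It assumes that the F1-score is the harmonic mean of precision and recall at the same operating point, and that 62.4 FPS is a steady per-frame throughput. Recall is not reported in this abstract, so the value computed here is merely implied by those assumptions. All function names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units for both)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom


def latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps


# Values quoted in the abstract (percentages and frames per second).
REPORTED_PRECISION = 89.51
REPORTED_F1 = 81.18
REPORTED_FPS = 62.4

# Assumption: P and F1 were measured at the same confidence threshold.
recall = implied_recall(REPORTED_PRECISION, REPORTED_F1)

print(f"implied recall: {recall:.2f} %")
print(f"per-frame latency: {latency_ms(REPORTED_FPS):.2f} ms")
```

Under these assumptions the implied recall is about 74.3 % and the per-frame latency is about 16 ms. This suggests that the model favors precision over recall at the reported operating point.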

