Underwater robotic vision faces significant challenges that call for detection methods with better accuracy and adaptability. This paper presents MARS (Multi-Scale Adaptive Robotics Vision), a novel approach to underwater object detection tailored to diverse underwater scenarios. MARS integrates Residual Attention YOLOv3 with Domain-Adaptive Multi-Scale Attention (DAMSA) to improve detection accuracy and to adapt to different domains. During training, DAMSA introduces domain class-based attention, enabling the model to emphasize domain-specific features. We evaluate MARS on several underwater datasets. On the original dataset, MARS achieves a mean Average Precision (mAP) of 58.57\% when detecting key underwater objects such as echinus, starfish, holothurian, scallop, and waterweeds. This capability holds promise for applications in marine robotics, marine biology research, and environmental monitoring. MARS also mitigates domain shift. On the augmented dataset, with all enhancements incorporated (+Domain +Residual +Channel Attention +Multi-Scale Attention), MARS achieves an mAP of 36.16\%. This result underscores its robustness and adaptability in recognizing objects across a range of underwater conditions. The source code for MARS is publicly available on GitHub at https://github.com/LyesSaadSaoud/MARS-Object-Detection/
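For readers unfamiliar with the mAP figures quoted above, the following is a minimal sketch of how a mean Average Precision can be computed from per-class detections. It is an illustration under stated assumptions, not the evaluation code used for MARS: the abstract does not state the IoU threshold or the interpolation scheme, so the sketch assumes all-point interpolation and assumes that each detection has already been matched against ground truth (the `is_true_positive` flag). The function names `average_precision` and `mean_average_precision` and the toy numbers are hypothetical.

```python
def average_precision(scored_hits, num_ground_truth):
    """All-point interpolated AP for a single class.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
                 Matching detections to ground truth (e.g. by IoU) is assumed
                 to have been done upstream.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision with the maximum precision at any higher recall,
    # which yields a monotonically non-increasing precision envelope.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the area under the precision-recall envelope.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class):
    """per_class maps class name -> (scored_hits, num_ground_truth)."""
    aps = {
        name: average_precision(hits, n_gt)
        for name, (hits, n_gt) in per_class.items()
    }
    return sum(aps.values()) / len(aps), aps


# Toy data (hypothetical, not taken from the paper).
toy = {
    "echinus": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "starfish": ([(0.6, True)], 1),
}
toy_map, toy_aps = mean_average_precision(toy)
```

Because mAP is an unweighted mean over classes, a class with few examples (for instance waterweeds) influences the final score as much as a frequent one.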