An Efficient Detection and Control System for Underwater Docking using Machine Learning and Realistic Simulation: A Comprehensive Approach

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Underwater docking is critical to enabling the persistent operation of Autonomous Underwater Vehicles (AUVs). To dock, the AUV must be able to detect and localize the docking station, which is difficult in the highly dynamic undersea environment. Image-based solutions offer a high acquisition rate and a versatile way to adapt to this environment; however, underwater imaging suffers from low visibility, high turbidity, and distortion. In addition, field experiments to validate underwater docking capabilities can be costly and dangerous because of the specialized equipment and safety considerations they require. This work compares different deep-learning architectures for underwater docking detection and classification. The best-performing architecture is then compressed using knowledge distillation under the teacher-student paradigm to reduce the network's memory footprint, allowing real-time implementation. To reduce the simulation-to-reality gap, a Generative Adversarial Network (GAN) performs image-to-image translation, converting each Gazebo simulation image into a realistic underwater-looking image. The resulting image is then processed with an underwater image formation model to simulate image attenuation over distance under different water types. Finally, the proposed method is evaluated by AUV docking success rate and compared with classical vision methods. The simulation results show an improvement of 20% in high-turbidity scenarios, regardless of the underwater currents. We further demonstrate the proposed approach with experimental results on the off-the-shelf Iver3 AUV.
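The attenuation step can be illustrated with a minimal sketch. The abstract does not state which image formation model is used, so the code below assumes the widely used simplified form I_c = J_c·exp(-β_c·d) + B_c·(1 - exp(-β_c·d)), applied per color channel c, where J is the clean image, d the distance to the scene, β the attenuation coefficient, and B the veiling (background) light. The function names, the `WATER_TYPES` table, and all coefficient values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

# Hypothetical per-channel (R, G, B) attenuation coefficients in 1/m.
# Illustrative values only; not from the paper or from measured water types.
WATER_TYPES = {
    "clear": (0.40, 0.07, 0.03),
    "turbid": (0.90, 0.50, 0.45),
}


def attenuate_pixel(rgb, distance_m, beta, veiling_light):
    """Apply I = J*exp(-beta*d) + B*(1 - exp(-beta*d)) to one pixel, per channel."""
    out = []
    for j, b, bg in zip(rgb, beta, veiling_light):
        t = math.exp(-b * distance_m)  # transmission along the line of sight
        out.append(j * t + bg * (1.0 - t))
    return tuple(out)


def attenuate_image(image, depth_map, water_type, veiling_light=(0.05, 0.35, 0.45)):
    """Attenuate a nested-list RGB image (values in [0, 1]) using a per-pixel depth map."""
    beta = WATER_TYPES[water_type]
    return [
        [attenuate_pixel(px, d, beta, veiling_light) for px, d in zip(row, depth_row)]
        for row, depth_row in zip(image, depth_map)
    ]


if __name__ == "__main__":
    # A 1x2 white image: one pixel at 1 m, one at 10 m, rendered in two water types.
    image = [[(1.0, 1.0, 1.0), (1.0, 1.0, 1.0)]]
    depth = [[1.0, 10.0]]
    for water in WATER_TYPES:
        print(water, attenuate_image(image, depth, water))
```

Under this assumed model, a pixel at zero distance is unchanged, and a distant pixel converges to the veiling light, which is the qualitative loss of contrast that the simulated images are meant to reproduce.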
