An Efficient Detection and Control System for Underwater Docking using Machine Learning and Realistic Simulation: A Comprehensive Approach

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Underwater docking is critical to enabling the persistent operation of Autonomous Underwater Vehicles (AUVs). To dock, the AUV must detect and localize the docking station, a task complicated by the highly dynamic undersea environment. Image-based solutions offer a high acquisition rate and a versatile way to adapt to this environment; however, underwater imaging suffers from low visibility, high turbidity, and distortion. In addition, field experiments to validate underwater docking capabilities can be costly and dangerous because of the specialized equipment and safety considerations they require. This work compares different deep-learning architectures for underwater docking detection and classification. The best-performing architecture is then compressed using knowledge distillation under the teacher-student paradigm to reduce the network's memory footprint, allowing real-time implementation. To reduce the simulation-to-reality gap, a Generative Adversarial Network (GAN) performs image-to-image translation, converting the Gazebo simulation image into a realistic underwater-looking image. The resulting image is then processed with an underwater image formation model to simulate image attenuation over distance under different water types. Finally, the proposed method is evaluated by AUV docking success rate and compared with classical vision methods. The simulation results show an improvement of 20% in the high-turbidity scenarios regardless of the underwater currents. Furthermore, we demonstrate the performance of the proposed approach with experimental results on the off-the-shelf AUV Iver3.
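
The abstract does not state which underwater image formation model is used. As a rough illustration of the idea only, the sketch below assumes one commonly used simplified form, in which each color channel of a clean image J is attenuated with distance d and blended with a backscatter color B: I = J * exp(-beta * d) + B * (1 - exp(-beta * d)). The function names, the water-type table, and all coefficient values are hypothetical placeholders chosen for illustration, not values from the paper.

```python
import math

# Hypothetical per-channel attenuation coefficients in 1/m, ordered (R, G, B).
# Illustrative placeholders only; not taken from the source.
WATER_TYPES = {
    "clear": (0.40, 0.08, 0.05),
    "turbid": (0.90, 0.45, 0.50),
}


def attenuate_pixel(rgb, distance_m, beta, backscatter):
    """Apply I = J * exp(-beta * d) + B * (1 - exp(-beta * d)) per channel."""
    out = []
    for j, beta_c, b_c in zip(rgb, beta, backscatter):
        t = math.exp(-beta_c * distance_m)  # transmission along the line of sight
        out.append(j * t + b_c * (1.0 - t))
    return tuple(out)


def attenuate_image(image, depth_map, water_type, backscatter=(0.05, 0.35, 0.40)):
    """Attenuate an image given a per-pixel distance map in meters.

    image: rows of (R, G, B) tuples with values in [0, 1].
    depth_map: rows of distances with the same shape as image.
    backscatter: assumed veiling-light color (hypothetical default).
    """
    beta = WATER_TYPES[water_type]
    return [
        [attenuate_pixel(px, d, beta, backscatter) for px, d in zip(row, d_row)]
        for row, d_row in zip(image, depth_map)
    ]


if __name__ == "__main__":
    white = [[(1.0, 1.0, 1.0)]]
    for d in (0.0, 2.0, 5.0, 10.0):
        print(d, attenuate_image(white, [[d]], "turbid")[0][0])
```

In a pipeline like the one described, the input image would presumably be the GAN-translated frame and the distance map would come from the simulator. Under this assumed model, a pixel at zero distance is unchanged, and distant pixels converge to the backscatter color, faster in the "turbid" setting than in the "clear" one.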
