Underwater object pose estimation allows an autonomous system to perform tracking and intervention tasks. Nonetheless, underwater target pose estimation is remarkably challenging due to, among many factors, limited visibility, light scattering, cluttered environments, and constantly varying water conditions. One approach is to employ sonar or laser sensing to acquire 3D data, but these sensors are costly and the resulting data is typically noisy. For this reason, the community has focused on extracting pose estimates from RGB input. However, the literature on this problem is scarce, and existing methods exhibit low detection accuracy. In this work, we propose an approach consisting of a 2D object detection stage and a 6D pose estimation stage that reliably obtains object poses in different underwater scenarios. To test our pipeline, we collect and make available a dataset of 4 objects in 10 different real scenes with annotations for object detection and pose estimation. We test our proposal in real and synthetic settings and compare its performance with similar end-to-end methodologies for 6D object pose estimation. Our dataset contains some challenging objects with symmetrical shapes and poor texture. Despite these object characteristics, our proposed method outperforms the state of the art in pose accuracy by ~8%. Finally, we demonstrate the reliability of our pose estimation pipeline through underwater manipulation experiments on a reaching task.