Object pose estimation is a non-trivial task that enables robotic manipulation, bin picking, augmented reality, and scene understanding, among other applications. Monocular object pose estimation has gained considerable momentum with the rise of high-performing deep learning-based solutions and is particularly attractive to the community because the sensors are inexpensive and inference is fast. Prior works establish a comprehensive state of the art for diverse pose estimation problems, but their broad scope makes it difficult to identify promising future directions. We narrow the scope to single-shot monocular 6D object pose estimation, which is commonly used in robotics, and are thus able to identify such directions. By reviewing recent publications in robotics and computer vision, we establish the state of the art at the union of both fields. We then identify promising research directions to help researchers formulate relevant research ideas and effectively advance the state of the art. Our findings include that current methods are sophisticated enough to overcome the domain shift, while occlusion handling remains a fundamental challenge. We also highlight novel object pose estimation and the handling of challenging materials as central problems for advancing robotics.