In this paper, we present Gen6D, a generalizable, model-free 6-DoF object pose estimator. Existing generalizable pose estimators either need high-quality object models or require additional depth maps or object masks at test time, which significantly limits their application scope. In contrast, our pose estimator requires only a set of posed images of the unseen object and can accurately predict the object's pose in arbitrary environments. Gen6D consists of an object detector, a viewpoint selector, and a pose refiner, none of which requires a 3D object model and all of which generalize to unseen objects. Experiments show that Gen6D achieves state-of-the-art results on two model-free datasets: the MOPED dataset and GenMOP, a new dataset that we collected. In addition, on the LINEMOD dataset, Gen6D achieves results competitive with instance-specific pose estimators. Project page: https://liuyuan-pal.github.io/Gen6D/.