We address the problem of keypoint selection and find that the performance of 6DoF pose estimation methods improves when the pre-defined keypoint locations are learned rather than heuristically selected, as has been the standard approach. Specifically, both accuracy and efficiency improve when a graph network is trained to select a set of dispersed keypoints with similarly distributed votes. These votes, which a regression network learns in order to accumulate evidence for the keypoint locations, can be regressed more accurately than the votes for keypoints chosen by previous heuristic algorithms. The proposed KeyGNet, supervised by a combined loss measuring both Wasserstein distance and dispersion, learns the color and geometry features of the target objects to estimate optimal keypoint locations. Experiments demonstrate that the keypoints selected by KeyGNet improve accuracy on all evaluation metrics across all seven datasets tested, for each of three keypoint voting methods. On the challenging Occlusion LINEMOD dataset, ADD(S) improved by 16.4% with PVN3D, and on all core BOP datasets AR improved for all objects, by between 1% and 21.5%. Performance also increased notably when moving from single-object to multiple-object training with KeyGNet keypoints, essentially eliminating the SISO-MIMO gap on Occlusion LINEMOD.
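To make the idea of a combined Wasserstein and dispersion objective concrete, the following is a minimal sketch and not the KeyGNet loss itself. It rests on several assumptions that the abstract does not state: votes are modeled as radial distances from object surface points to each keypoint, the Wasserstein term is the 1-D Wasserstein-1 distance between the vote distributions of each keypoint pair, dispersion is the mean pairwise distance between keypoints, and the two terms are combined by a simple weighted difference. The names `radial_votes`, `wasserstein_1d`, `combined_loss`, and `dispersion_weight` are hypothetical.

```python
import numpy as np

def radial_votes(points, keypoint):
    """Distance from every object point to one keypoint (assumed vote form)."""
    return np.linalg.norm(points - keypoint, axis=1)

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-size empirical samples.

    For equally weighted samples of the same size, this equals the mean
    absolute difference between the sorted values.
    """
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def combined_loss(points, keypoints, dispersion_weight=1.0):
    """Illustrative objective: similar vote distributions, spread-out keypoints.

    Lower is better: the Wasserstein term penalizes keypoints whose vote
    distributions differ, and the dispersion term rewards keypoints that
    lie far apart. The weighted difference is an assumption of this sketch.
    """
    k = len(keypoints)
    votes = [radial_votes(points, kp) for kp in keypoints]
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    wass = np.mean([wasserstein_1d(votes[i], votes[j]) for i, j in pairs])
    disp = np.mean([np.linalg.norm(keypoints[i] - keypoints[j]) for i, j in pairs])
    return float(wass - dispersion_weight * disp)
```

In a learned setting, the keypoints would be the output of a network and the loss would be written in a differentiable framework; the NumPy version here only illustrates what the two terms measure.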