We investigate the impact of pre-defined keypoints on pose estimation and find that accuracy and efficiency can be improved by training a graph network to select a set of dispersed keypoints with similarly distributed votes. These votes, learned by a regression network to accumulate evidence for the keypoint locations, can be regressed more accurately than those for keypoints chosen by previous heuristic algorithms. The proposed KeyGNet, supervised by a combined loss measuring both Wasserstein distance and dispersion, learns the color and geometry features of the target objects to estimate optimal keypoint locations. Experiments demonstrate that the keypoints selected by KeyGNet improve accuracy on all evaluation metrics across all seven datasets tested, for three keypoint voting methods. On the challenging Occlusion LINEMOD dataset, ADD(S) improved by +16.4% on PVN3D, and on all core BOP datasets AR improved for all objects, by between +1% and +21.5%. Performance also increased notably when transitioning from single-object to multiple-object training using KeyGNet keypoints, essentially eliminating the SISO-MIMO gap on Occlusion LINEMOD.
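To make the idea of a combined Wasserstein and dispersion objective concrete, the following is a minimal sketch and not the KeyGNet loss itself, whose exact formulation is not given here. It assumes that each keypoint's votes are the radial distances from object surface points to that keypoint, that vote similarity is measured by the 1D Wasserstein-1 distance between those distance samples, and that dispersion is the mean pairwise distance between keypoints. The function names `wasserstein_1d` and `combined_loss` and the weight `alpha` are hypothetical.

```python
import numpy as np


def wasserstein_1d(a, b):
    """W1 distance between two equal-size empirical 1D samples.

    For equal-size samples this reduces to the mean absolute
    difference between the sorted values.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    return float(np.mean(np.abs(a - b)))


def combined_loss(points, keypoints, alpha=0.5):
    """Illustrative objective: similar vote distributions, dispersed keypoints.

    points:    (N, 3) object surface points
    keypoints: (M, 3) candidate keypoints, M >= 2
    alpha:     hypothetical weight between the two terms
    """
    points = np.asarray(points, dtype=float)
    keypoints = np.asarray(keypoints, dtype=float)
    # votes[m, n]: distance from surface point n to keypoint m
    # (assumed radial votes; other voting schemes use offsets or directions)
    votes = np.linalg.norm(points[None, :, :] - keypoints[:, None, :], axis=-1)
    m = len(keypoints)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    # Lower is better: vote distributions of different keypoints look alike
    w_term = np.mean([wasserstein_1d(votes[i], votes[j]) for i, j in pairs])
    # Higher is better: keypoints are spread apart, hence the minus sign below
    d_term = np.mean([np.linalg.norm(keypoints[i] - keypoints[j]) for i, j in pairs])
    return float(alpha * w_term - (1.0 - alpha) * d_term)


# Usage: the 8 corners of a cube and two mirror-symmetric keypoints
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)], dtype=float)
kps = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
loss = combined_loss(cube, kps)
```

In this toy case the two keypoints see identical vote distributions by symmetry, so only the dispersion term contributes. In a learned setting, the keypoint coordinates would be the output of a network and the two terms would be differentiable tensor operations rather than NumPy calls.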