Recent work on 6D object pose estimation focuses on learning keypoint correspondences between images and object models, and then determining the object pose either through RANSAC-based algorithms or by directly regressing it with end-to-end optimisation. We argue that learning point-level discriminative features has been overlooked in the literature. To this end, we revisit Fully Convolutional Geometric Features (FCGF) and tailor it to 6D object pose estimation, achieving state-of-the-art performance. FCGF employs sparse convolutions and learns point-level features with a fully convolutional network by optimising a hardest contrastive loss. We outperform recent competitors on popular benchmarks by making key modifications to the loss and to the input data representations, by carefully tuning the training strategies, and by employing data augmentations suited to the underlying problem. We carry out a thorough ablation study to assess the contribution of each modification. The code is available at https://github.com/jcorsetti/FCGF6D.
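To make the idea of a hardest contrastive loss concrete, the following is a minimal NumPy sketch of its general form: matched features are pulled together, and each feature is pushed away from its closest non-matching feature. It is not the modified loss used in this work, whose details are not given here. The function name, the margin values, the equal weighting of the terms, and the assumption that row `i` of `feat_a` matches row `i` of `feat_b` are all illustrative choices.

```python
import numpy as np


def hardest_contrastive_loss(feat_a, feat_b, pos_margin=0.1, neg_margin=1.4):
    """Simplified hardest contrastive loss (illustrative sketch).

    feat_a, feat_b: arrays of shape (N, D); feat_a[i] and feat_b[i] are
    assumed to be the features of a matched (positive) pair.
    pos_margin, neg_margin: hypothetical margin values chosen for the example.
    """
    # Pairwise Euclidean distances between every row of feat_a and feat_b.
    diff = feat_a[:, None, :] - feat_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # shape (N, N)

    # Distances of the matched pairs lie on the diagonal.
    pos_dist = np.diag(dist)

    # Exclude the positives, then take the closest remaining feature as the
    # hardest negative, once per row (for feat_a) and once per column (feat_b).
    masked = dist + np.diag(np.full(len(dist), np.inf))
    hardest_neg_a = masked.min(axis=1)
    hardest_neg_b = masked.min(axis=0)

    # Positives are penalised beyond pos_margin, negatives inside neg_margin.
    pos_term = np.maximum(pos_dist - pos_margin, 0.0) ** 2
    neg_term_a = np.maximum(neg_margin - hardest_neg_a, 0.0) ** 2
    neg_term_b = np.maximum(neg_margin - hardest_neg_b, 0.0) ** 2
    return pos_term.mean() + 0.5 * (neg_term_a.mean() + neg_term_b.mean())
```

Under these assumptions the loss is zero when matched features coincide and all non-matching features are farther apart than `neg_margin`, and it grows when features collapse to a single point. In practice the candidate negatives are usually subsampled, since the full pairwise distance matrix grows quadratically with the number of points.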