Traditional estimation methods based on geometric registration exploit the CAD model only implicitly, which makes them dependent on observation quality and vulnerable to occlusion. To address this problem, we propose a bidirectional correspondence prediction network with a point-wise attention-aware mechanism. This network not only requires the model points to predict the correspondence but also explicitly models the geometric similarities between observations and the model prior. Our key insight is that the correlations between each model point and scene point provide essential information for learning point-pair matches. To further tackle the correlation noise caused by divergence between feature distributions, we design a simple but effective pseudo-siamese network to improve feature homogeneity. Experimental results on the public LineMOD, YCB-Video, and Occ-LineMOD datasets show that the proposed method outperforms other state-of-the-art methods under the same evaluation criteria. Its robustness in pose estimation is greatly improved, especially under severe occlusion.
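The idea of correlating every model point with every scene point, and reading the result in both directions, can be illustrated with a small sketch. This is not the network described in the abstract: it assumes per-point feature vectors are already available (hand-set here, learned in practice), uses cosine similarity as the correlation, and applies a softmax with an assumed temperature as a stand-in for the point-wise attention. The pseudo-siamese feature extractor is not modeled, and all function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def correlation_matrix(model_feats, scene_feats):
    """Cosine similarity between every model point and every scene point."""
    model_unit = [l2_normalize(f) for f in model_feats]
    scene_unit = [l2_normalize(f) for f in scene_feats]
    return [[sum(a * b for a, b in zip(m, s)) for s in scene_unit] for m in model_unit]


def bidirectional_attention(corr, temperature=0.1):
    """Turn one correlation matrix into soft matches in both directions.

    The temperature value is an assumption chosen for illustration.
    """
    # Row-wise softmax: each model point attends over all scene points.
    model_to_scene = [softmax([c / temperature for c in row]) for row in corr]
    # Column-wise softmax: each scene point attends over all model points.
    columns = list(zip(*corr))
    scene_to_model = [softmax([c / temperature for c in col]) for col in columns]
    return model_to_scene, scene_to_model


def soft_correspondence(weights, target_points):
    """Predict a matched 3D coordinate as the attention-weighted mean of target points."""
    return [
        [sum(w * p[axis] for w, p in zip(row, target_points)) for axis in range(3)]
        for row in weights
    ]
```

Given feature lists for M model points and N scene points, `correlation_matrix` returns an M×N matrix, `bidirectional_attention` returns an M×N and an N×M weight matrix whose rows each sum to one, and `soft_correspondence` maps either set of weights onto the 3D coordinates of the opposite point set. In this toy setting, features drawn from mismatched distributions would flatten or distort the correlation values, which is the kind of noise the abstract attributes to feature distribution divergence.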