An object location prior is critical in the standard 6D object pose estimation setting. The prior can be used to initialize the 3D object translation and to facilitate 3D object rotation estimation. Unfortunately, the object detectors used to obtain this prior do not generalize to unseen objects. Existing 6D pose estimation methods for unseen objects therefore either assume that the ground-truth object location is known or yield inaccurate results when it is unavailable. In this paper, we address this problem by developing LocPoseNet, a method that robustly learns a location prior for unseen objects. Our method builds upon a template matching strategy, in which we distribute the reference kernels and convolve them with a query to efficiently compute multi-scale correlations. We then introduce a novel translation estimator that decouples scale-aware and scale-robust features to predict different object location parameters. Our method outperforms existing works by a large margin on LINEMOD and GenMOP. We further construct a challenging synthetic dataset, which allows us to highlight the greater robustness of our method to various noise sources. Our project website is at: https://sailor-z.github.io/projects/3DV2024_LocPoseNet.html.
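To make the template matching idea concrete, the following is a minimal toy sketch of multi-scale correlation, not the LocPoseNet implementation. It assumes raw 2D intensity grids instead of learned features, resizes a single reference template to several kernel sizes with nearest-neighbour sampling, and scores each query location by cosine similarity. The function names (`resize_nearest`, `correlate`, `locate`) and the toy data are hypothetical and chosen only for illustration; the kernel distribution scheme and the translation estimator described in the abstract are not modelled here.

```python
import math


def resize_nearest(kernel, size):
    """Resize a square 2D kernel to size x size with nearest-neighbour sampling."""
    n = len(kernel)
    return [[kernel[i * n // size][j * n // size] for j in range(size)]
            for i in range(size)]


def correlate(query, kernel):
    """Slide a square kernel over the query (valid mode) and return cosine scores."""
    k = len(kernel)
    k_norm = sum(v * v for row in kernel for v in row)
    out = []
    for y in range(len(query) - k + 1):
        out_row = []
        for x in range(len(query[0]) - k + 1):
            dot = q_norm = 0.0
            for i in range(k):
                for j in range(k):
                    q = query[y + i][x + j]
                    dot += q * kernel[i][j]
                    q_norm += q * q
            denom = math.sqrt(q_norm * k_norm)
            # An all-zero patch has no direction, so give it a zero score.
            out_row.append(dot / denom if denom > 0 else 0.0)
        out.append(out_row)
    return out


def locate(query, template, sizes):
    """Return (score, kernel_size, (row, col)) of the best response over all sizes."""
    best = (-1.0, None, None)
    for size in sizes:
        response = correlate(query, resize_nearest(template, size))
        for y, row in enumerate(response):
            for x, score in enumerate(row):
                if score > best[0]:
                    best = (score, size, (y, x))
    return best


# Toy usage: paste a 2x-enlarged copy of the template into an empty query.
template = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
query = [[0.0] * 12 for _ in range(12)]
enlarged = resize_nearest(template, 6)
for i in range(6):
    for j in range(6):
        query[2 + i][4 + j] = enlarged[i][j]

score, size, top_left = locate(query, template, sizes=[3, 6, 9])
print(score, size, top_left)
```

In this toy setup the strongest response comes from the size-6 kernel at the location where the enlarged template was pasted, so the winning kernel size gives a coarse scale estimate and the peak position gives a 2D location. A learned pipeline would replace the raw intensities with feature maps and the exhaustive Python loops with batched convolutions.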