Recent advances in RGBD-based category-level object pose estimation have been limited by their reliance on precise depth information, restricting their broader applicability. In response, RGB-based methods have been developed. Among these methods, geometry-guided pose regression, which originated in instance-level tasks, has demonstrated strong performance. However, we argue that the NOCS map is an inadequate intermediate representation for geometry-guided pose regression, as its many-to-one correspondence with category-level pose introduces redundant instance-specific information, leading to suboptimal results. This paper identifies the intra-class variation problem inherent in pose regression based solely on the NOCS map and proposes the Intra-class Variation-Free Consensus (IVFC) map, a novel coordinate representation generated from the category-level consensus model. By leveraging the complementary strengths of the NOCS map and the IVFC map, we introduce GIVEPose, a framework that implements Gradual Intra-class Variation Elimination for category-level object pose estimation. Extensive evaluations on both synthetic and real-world datasets demonstrate that GIVEPose significantly outperforms existing state-of-the-art RGB-based approaches. Our code is available at https://github.com/ziqin-h/GIVEPose.