In this paper, we tackle the challenging problem of 3D keypoint estimation for general objects using a novel implicit representation. Previous works have demonstrated promising results for keypoint prediction through direct coordinate regression or heatmap-based inference. However, these methods are commonly studied for specific subjects, such as human bodies and faces, which possess fixed keypoint structures. They also struggle in several practical scenarios where explicit or complete geometry is not given, such as images and partial point clouds. Inspired by the recent success of implicit representations in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints. Specifically, our key idea is to represent each 3D keypoint as a sphere, which makes the corresponding signed distance field learnable. Explicit keypoints can then be extracted by our algorithm based on the Hough transform. Quantitative and qualitative evaluations show the superiority of our representation in terms of prediction accuracy.
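The idea of representing keypoints as spheres and recovering them by voting can be illustrated with a small sketch. It is not the authors' algorithm. It assumes all spheres share one known radius, evaluates the signed distance field analytically where the paper would use a learned network, and replaces the paper's extraction step with a generic Hough-style accumulator. All function names and parameters (`sphere_sdf`, `sdf_gradient`, `hough_vote_centers`, `bins`, `lo`, `hi`) are hypothetical. The sketch relies on one geometric fact: a point at signed distance s from a sphere of radius r lies s + r from the sphere's center, along the field's gradient direction.

```python
import numpy as np

def sphere_sdf(points, centers, radius):
    """Signed distance from each point to a union of equal-radius spheres."""
    # (N, K) distances from every point to every sphere center
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return d.min(axis=1) - radius

def sdf_gradient(sdf, points, eps=1e-4):
    """Central finite-difference gradient of a scalar field at each point."""
    grad = np.zeros_like(points)
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = eps
        grad[:, axis] = (sdf(points + step) - sdf(points - step)) / (2 * eps)
    return grad

def hough_vote_centers(points, values, grads, radius, num_keypoints,
                       bins=20, lo=-1.0, hi=1.0):
    """Let every sample vote for the center it implies; return the peak cells."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    dirs = grads / np.maximum(norms, 1e-12)
    # Walk back along the gradient by (s + r) to reach the implied center.
    votes = points - (values[:, None] + radius) * dirs
    votes = votes[np.all((votes >= lo) & (votes < hi), axis=1)]
    # Accumulate votes in a regular voxel grid (the Hough accumulator).
    cell = (hi - lo) / bins
    idx = np.clip(np.floor((votes - lo) / cell).astype(int), 0, bins - 1)
    flat = np.ravel_multi_index(tuple(idx.T), (bins, bins, bins))
    counts = np.bincount(flat, minlength=bins ** 3)
    top = np.argsort(counts)[::-1][:num_keypoints]
    # Refine each peak by averaging the votes that fell into its cell.
    return np.stack([votes[flat == t].mean(axis=0) for t in top])

# Toy setup (assumed values): two keypoints inside the cube [-1, 1)^3.
centers = np.array([[0.35, 0.25, -0.15], [-0.45, -0.35, 0.55]])
radius = 0.1
rng = np.random.default_rng(0)
queries = rng.uniform(-1.0, 1.0, size=(4000, 3))

field = lambda p: sphere_sdf(p, centers, radius)  # stand-in for a learned field
values = field(queries)
grads = sdf_gradient(field, queries)
recovered = hough_vote_centers(queries, values, grads, radius, num_keypoints=2)
```

In this toy setting the votes concentrate near the true centers because the field is exact. With a learned field the votes would scatter, which is where an accumulator with peak detection becomes useful. The grid resolution and the peak-picking rule are design choices of this sketch rather than details taken from the paper.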