Signed distance functions (SDFs) are an attractive framework that has recently shown promising results for 3D shape reconstruction from images. SDFs seamlessly generalize to different shape resolutions and topologies but lack explicit modelling of the underlying 3D geometry. In this work, we exploit the hand structure and use it as guidance for SDF-based shape reconstruction. In particular, we address the reconstruction of hands and manipulated objects from monocular RGB images. To this end, we estimate the poses of hands and objects and use them to guide 3D reconstruction. More specifically, we predict kinematic chains of pose transformations and align SDFs with highly articulated hand poses. We improve the visual features of 3D points with geometry alignment and further leverage temporal information to improve robustness to occlusion and motion blur. We conduct extensive experiments on the challenging ObMan and DexYCB benchmarks and demonstrate significant improvements of the proposed method over the state of the art.
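The idea of aligning an SDF with a pose predicted along a kinematic chain can be illustrated with a minimal sketch. This is not the proposed method: the learned SDF is replaced by an analytic sphere, joint rotations are restricted to the z-axis, and every name (`compose_chain`, `to_canonical`, `posed_sdf`) is a hypothetical helper introduced only for illustration. The sketch composes local joint transforms from root to tip, maps a query point into the resulting joint frame, and evaluates the SDF there.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis (an assumption that keeps the example small)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]

def compose_chain(local_transforms):
    """Accumulate local (R, t) joint transforms, root to tip, into one global (R, t)."""
    R_glob = rot_z(0.0)           # identity rotation
    t_glob = [0.0, 0.0, 0.0]
    for R_loc, t_loc in local_transforms:
        # x_world = R_glob (R_loc x + t_loc) + t_glob
        t_glob = [a + b for a, b in zip(mat_vec(R_glob, t_loc), t_glob)]
        R_glob = mat_mul(R_glob, R_loc)
    return R_glob, t_glob

def to_canonical(point, R, t):
    """Inverse rigid transform: x_local = R^T (x_world - t)."""
    return mat_vec(transpose(R), [p - q for p, q in zip(point, t)])

def sphere_sdf(point, radius):
    """Stand-in for a learned SDF: signed distance to a sphere at the origin."""
    return math.sqrt(sum(p * p for p in point)) - radius

def posed_sdf(point, chain, radius):
    """Evaluate the SDF in the frame of the last joint of the chain."""
    R, t = compose_chain(chain)
    return sphere_sdf(to_canonical(point, R, t), radius)

# Two-joint chain: rotate 90 degrees at the root, then two unit-length bones.
chain = [(rot_z(math.pi / 2), [1.0, 0.0, 0.0]),
         (rot_z(0.0), [1.0, 0.0, 0.0])]
print(posed_sdf([1.0, 1.5, 0.0], chain, 0.2))
```

Querying in the joint's own frame means the same canonical SDF can be reused for any pose; only the chain of transforms changes. Negative values indicate points inside the surface, positive values points outside.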