3D reconstruction serves as the foundational layer for numerous robotic perception tasks, including 6D object pose estimation and grasp pose generation. Modern object-level 3D reconstruction methods can produce visually and geometrically impressive meshes from multi-view images, yet standard geometric evaluations do not reflect how reconstruction quality influences performance on downstream tasks such as robotic manipulation. This paper addresses this gap by introducing a large-scale, physics-based benchmark that evaluates 6D pose estimators and 3D mesh models by their functional efficacy in grasping. We analyze the impact of model fidelity by generating grasps on various reconstructed 3D meshes and executing them on the ground-truth model, which simulates how grasp poses generated from an imperfect model affect interaction with the real object. This protocol assesses the combined impact of pose error, grasp robustness, and geometric inaccuracies from 3D reconstruction. Our results show that reconstruction artifacts significantly decrease the number of grasp pose candidates but have a negligible effect on grasping performance when the pose is estimated accurately. They also reveal that the relationship between grasp success and pose error is dominated by spatial error, and that even a simple translation error is informative about grasp success for symmetric objects. This work provides insight into how perception systems relate to robotic object manipulation.