Existing methods for reconstructing interactive scenes primarily replace reconstructed objects with CAD models retrieved from a limited database, resulting in significant discrepancies between the reconstructed and observed scenes. To address this issue, our work introduces a part-level reconstruction approach that reassembles objects using primitive shapes. This enables us to precisely replicate the observed physical scenes and simulate robot interactions with both rigid and articulated objects. By segmenting reconstructed objects into semantic parts and aligning primitive shapes to these parts, we assemble them into CAD models while estimating kinematic relations, including parent-child contact relations, joint types, and joint parameters. Specifically, we derive the optimal primitive alignment by solving a series of optimization problems, and we estimate kinematic relations from part semantics and geometry. Our experiments demonstrate that part-level scene reconstruction outperforms object-level reconstruction by capturing finer details more accurately and improving precision. These reconstructed part-level interactive scenes provide valuable kinematic information for various robotic applications; we showcase the feasibility of certifying mobile manipulation planning in these interactive scenes before executing tasks in the physical world.
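To make the idea of aligning a primitive shape to a segmented part more concrete, the following is a minimal sketch that fits an oriented cuboid to a part's point cloud using principal component analysis. This is an illustrative stand-in, not the optimization formulation used in this work: the function name `fit_box_pca`, the choice of a cuboid, and the use of PCA are all assumptions made for the example.

```python
import numpy as np


def fit_box_pca(points):
    """Fit an oriented cuboid to an (N, 3) array of part points.

    Hypothetical helper for illustration only; PCA replaces the
    optimization-based primitive alignment described in the abstract.
    Returns (box_center, axes, extents), where the columns of `axes`
    are the orthonormal box directions in world coordinates.
    """
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    centered = points - center

    # Eigenvectors of the covariance matrix serve as the box axes.
    cov = np.cov(centered.T)
    _, axes = np.linalg.eigh(cov)

    # Express the points in the box frame and take their bounds.
    local = centered @ axes
    lo, hi = local.min(axis=0), local.max(axis=0)
    extents = hi - lo

    # Shift the center to the middle of the bounds, back in world frame.
    box_center = center + axes @ ((lo + hi) / 2.0)
    return box_center, axes, extents
```

A PCA fit like this is sensitive to noise and to uneven point density, which is one reason an explicit optimization over primitive parameters may be preferable in practice. The fitted center, axes, and extents could then feed a later step that estimates contact and joint relations between neighboring parts.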