Understanding and manipulating articulated objects, such as doors and drawers, is crucial for robots operating in human environments. We aim to develop a system that, after training on other articulated objects, can articulate novel objects without any prior interaction with them. Previous approaches to articulated object manipulation rely either on modular methods, which are brittle, or on end-to-end methods, which lack generalizability. This paper presents FlowBot++, a deep 3D vision-based robotic system that predicts dense per-point motion and dense articulation parameters of articulated objects to assist in downstream manipulation tasks. FlowBot++ introduces a novel per-point representation of articulated motion and articulation parameters; the two are combined to produce a more accurate estimate than either yields on its own. Simulated experiments on the PartNet-Mobility dataset validate the performance of our system in articulating a wide range of objects, while real-world experiments on point clouds of real objects and on a Sawyer robot demonstrate the generalizability and feasibility of our system in real-world scenarios.
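To make the idea of dense per-point motion concrete, the sketch below computes a motion direction for every point of a part once the joint axis is known. It is an illustration under stated assumptions, not the FlowBot++ implementation: the function names (`revolute_flow`, `prismatic_flow`), the choice of unit-length direction vectors, and the parameterization of the axis by a direction and an origin are all hypothetical. It relies only on standard rigid-body kinematics: a point rotating about an axis moves along the cross product of the axis direction and its offset from the axis, and a point on a sliding part moves along the axis direction.

```python
import math

def cross(a, b):
    # Cross product of two 3D vectors given as tuples.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    # Scale a vector to unit length; a zero vector is returned unchanged.
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v) if n > 0 else (0.0, 0.0, 0.0)

def revolute_flow(points, axis_dir, axis_origin):
    # Hypothetical helper: unit instantaneous motion direction of each
    # point on a part that rotates about the given axis (e.g. a door).
    w = normalize(axis_dir)
    flows = []
    for p in points:
        r = tuple(pi - oi for pi, oi in zip(p, axis_origin))
        flows.append(normalize(cross(w, r)))
    return flows

def prismatic_flow(points, axis_dir):
    # Hypothetical helper: every point on a sliding part (e.g. a drawer)
    # moves along the same axis direction.
    d = normalize(axis_dir)
    return [d for _ in points]
```

For example, with a vertical hinge through the origin, `revolute_flow([(1.0, 0.0, 0.0)], (0.0, 0.0, 1.0), (0.0, 0.0, 0.0))` gives a direction along the y-axis, while a point lying on the hinge itself gets a zero vector. In a learned system these quantities would be predicted from the point cloud rather than computed from a known axis.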