Objects rarely sit in isolation in human environments. We therefore want robots to reason about how multiple objects relate to one another and how those relations change as the robot interacts with the world. To this end, we propose a novel graph neural network framework for multi-object manipulation that predicts how inter-object relations change given robot actions. Our model operates on partial-view point clouds and can reason about multiple objects interacting dynamically during manipulation. By learning a dynamics model in a learned latent graph embedding space, our model enables multi-step planning to reach target goal relations. We show that our model, trained purely in simulation, transfers well to the real world. Our planner enables the robot to rearrange a variable number of objects with a range of shapes and sizes using both push and pick-and-place skills.
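To make the idea of multi-step planning in a latent space concrete, the following is a minimal sketch and not the method described above. It swaps the learned graph neural network for a toy additive dynamics function and uses a simple random-shooting planner. The names `rollout`, `plan`, `toy_step`, and `toy_goal_score` are hypothetical, and the 2-D latent state merely stands in for a learned graph embedding.

```python
import numpy as np

def rollout(z0, actions, step):
    """Apply a latent dynamics function step(z, a) along an action sequence."""
    z = z0
    for a in actions:
        z = step(z, a)
    return z

def plan(z0, goal_score, step, horizon, action_dim, n_samples=256, seed=0):
    """Random shooting: sample action sequences and keep the best-scoring one."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, action_dim))
    scores = [goal_score(rollout(z0, seq, step)) for seq in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

# Toy stand-ins (assumptions for illustration, not learned networks).
def toy_step(z, a):
    # Hypothetical latent dynamics: the action shifts the latent state.
    return z + a

def toy_goal_score(z):
    # Hypothetical goal score: higher when z is closer to a target latent.
    target = np.array([1.0, 0.5])
    return -float(np.linalg.norm(z - target))

best_actions, best_score = plan(
    np.zeros(2), toy_goal_score, toy_step, horizon=3, action_dim=2
)
```

In a setting like the one described, `step` would be replaced by the learned latent dynamics model and `goal_score` by a measure of how well the predicted relations match the goal relations; both substitutions are assumptions about how such a planner could be wired up.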