We explore a novel method for perceiving and manipulating 3D articulated objects that generalizes well enough to let a robot articulate unseen classes of objects. We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects, and we use these predictions to guide downstream motion planning that articulates the objects. To predict object motion, we train a neural network to output a dense vector field that gives, for each point in the point cloud, its direction of motion under articulation. We then deploy an analytical motion planner based on this vector field to obtain a policy that yields maximum articulation. We train the vision system entirely in simulation, and we demonstrate that it generalizes to unseen object instances and novel categories in both simulation and the real world, deploying our policy on a Sawyer robot with no finetuning. Results show that our system achieves state-of-the-art performance in both simulated and real-world experiments.
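To make the planning step concrete, the following is a minimal sketch of how a dense per-point motion field could be turned into a single action. It assumes, purely for illustration, that the planner picks the point with the largest predicted motion and moves along that point's predicted direction; the abstract does not specify the selection rule, so this heuristic, the function name `select_contact_and_direction`, and the `eps` threshold are all hypothetical. The network itself is not modeled here: `flow` stands in for its output.

```python
import numpy as np


def select_contact_and_direction(points, flow, eps=1e-8):
    """Pick a contact point and a unit motion direction from a flow field.

    points: (N, 3) array, the observed point cloud.
    flow:   (N, 3) array, a predicted per-point motion vector
            (stand-in for the network output; hypothetical interface).
    Returns (index, contact_point, unit_direction).
    """
    points = np.asarray(points, dtype=float)
    flow = np.asarray(flow, dtype=float)
    if points.ndim != 2 or points.shape[1] != 3 or points.shape != flow.shape:
        raise ValueError("points and flow must both have shape (N, 3)")

    # Magnitude of the predicted motion at every point.
    magnitudes = np.linalg.norm(flow, axis=1)

    # Assumed heuristic: the point predicted to move the most is the
    # one where an applied motion articulates the part the most.
    idx = int(np.argmax(magnitudes))
    if magnitudes[idx] < eps:
        raise ValueError("flow field predicts no motion")

    # Normalize so the caller can choose its own step size.
    direction = flow[idx] / magnitudes[idx]
    return idx, points[idx], direction


# Toy example: three points on a door hinged along the z-axis at the
# origin. Points farther from the hinge move more when the door opens.
door_points = np.array([[0.1, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])
door_flow = np.array([[0.0, 0.1, 0.0], [0.0, 0.5, 0.0], [0.0, 1.0, 0.0]])
idx, contact, direction = select_contact_and_direction(door_points, door_flow)
```

In the toy example the selected point is the one farthest from the hinge, which matches the intuition that pushing a door near its free edge gives the most leverage. A closed-loop controller would presumably re-run the prediction after each small step, though that is likewise an assumption rather than something stated above.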