We explore a novel method for perceiving and manipulating 3D articulated objects that generalizes well enough for a robot to articulate objects from unseen classes. We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects, and uses these predictions to guide downstream motion planning for articulating them. To predict object motion, we train a neural network to output a dense vector field that represents, for each point in the object's point cloud, its direction of motion under articulation. We then deploy an analytical motion planner based on this vector field to obtain a policy that maximizes articulation. We train the vision system entirely in simulation, and we demonstrate that the system generalizes to unseen object instances and novel categories in both simulation and the real world, deploying our policy on a Sawyer robot without fine-tuning. Results show that our system achieves state-of-the-art performance in both simulated and real-world experiments.
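As an illustration of how a dense per-point motion field could drive an analytical planner, the following is a minimal sketch, not the planner used in this work. It assumes the network's output is available as an `(N, 3)` array `flow` aligned with an `(N, 3)` point cloud `points`, and it assumes a simple selection rule: interact at the point with the largest predicted motion and move along its predicted direction. The function name `select_contact_and_direction` and the `eps` threshold are hypothetical.

```python
import numpy as np


def select_contact_and_direction(points, flow, eps=1e-8):
    """Pick the point with the largest predicted motion and its unit direction.

    Assumption (not stated in the abstract): the planner interacts at the
    point whose predicted motion vector has the greatest magnitude.
    """
    points = np.asarray(points, dtype=float)
    flow = np.asarray(flow, dtype=float)
    if points.shape != flow.shape or points.ndim != 2 or points.shape[1] != 3:
        raise ValueError("points and flow must both have shape (N, 3)")

    # Per-point magnitude of the predicted motion vectors.
    magnitudes = np.linalg.norm(flow, axis=1)
    idx = int(np.argmax(magnitudes))
    if magnitudes[idx] < eps:
        raise ValueError("predicted flow is zero everywhere; nothing to articulate")

    # Unit vector along which the selected point is predicted to move.
    direction = flow[idx] / magnitudes[idx]
    return idx, points[idx], direction


# Toy example: three points, the last one has the largest predicted motion.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
flow = np.array([[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 2.0]])
idx, contact, direction = select_contact_and_direction(points, flow)
```

In a closed-loop setting, one plausible design is to re-run the prediction and this selection step after each small motion, so the commanded direction follows the part as it moves along its articulation path.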