From dishwashers to cabinets, humans interact with articulated objects every day, and a robot that assists with common manipulation tasks must learn a representation of articulation. Recent deep learning methods can provide powerful vision-based priors on the affordance of articulated objects from previous, possibly simulated, experiences. In contrast, many works estimate articulation by observing the object in motion, which requires the robot to already be interacting with the object. In this work, we propose to combine the best of both worlds by introducing an online estimation method that merges vision-based affordance predictions from a neural network with interactive kinematic sensing in an analytical model. Our method can use vision to predict an articulation model before touching the object, and can then update the model quickly from kinematic sensing during the interaction. We implement a full system using shared autonomy for robotic opening of articulated objects, in particular objects whose articulation is not apparent from vision alone. We deployed the system on a real robot and performed several autonomous closed-loop experiments in which the robot had to open a door with an unknown joint while estimating the articulation online. Our system achieved an 80% success rate for autonomous opening of unknown articulated objects.
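The abstract does not specify the estimator, so the following is only an illustrative sketch of the general idea of merging a vision-based prior with kinematic measurements online. It assumes, purely for illustration, that both the vision prediction and each kinematic measurement are independent Gaussian estimates of a planar hinge position, and fuses them by precision weighting. The names `GaussianEstimate` and `fuse` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GaussianEstimate:
    """Hypothetical container: per-axis mean and variance of a hinge position."""
    mean: Tuple[float, ...]
    var: Tuple[float, ...]


def fuse(prior: GaussianEstimate, measurement: GaussianEstimate) -> GaussianEstimate:
    """Precision-weighted product of two independent, axis-aligned Gaussians."""
    means, variances = [], []
    for m_p, v_p, m_m, v_m in zip(prior.mean, prior.var,
                                  measurement.mean, measurement.var):
        precision = 1.0 / v_p + 1.0 / v_m          # information adds
        means.append((m_p / v_p + m_m / v_m) / precision)
        variances.append(1.0 / precision)          # fused variance shrinks
    return GaussianEstimate(tuple(means), tuple(variances))


# Assumed numbers: a vision prior formed before contact, then two
# kinematic measurements folded in one at a time during the interaction.
vision_prior = GaussianEstimate(mean=(0.0, 0.0), var=(1.0, 1.0))
kinematic_measurements = [
    GaussianEstimate(mean=(1.0, 2.0), var=(1.0, 1.0)),
    GaussianEstimate(mean=(1.0, 2.0), var=(0.5, 0.5)),
]

estimate = vision_prior
for z in kinematic_measurements:
    estimate = fuse(estimate, z)
```

In this sketch the estimate starts at the vision prior and moves toward the kinematic measurements as they arrive, while its variance decreases with every update. A real system would need a richer articulation parameterization (joint type, axis, and so on) than a single planar point.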