This work develops a data-efficient learning-from-demonstration framework that exploits rich tactile sensing to achieve fine, dexterous bimanual manipulation. Specifically, we formulate a convolutional autoencoder network that extracts and encodes high-dimensional tactile information. We further develop a behaviour cloning network that learns human-like sensorimotor skills, demonstrated directly on the robot hardware in the task space, by fusing proprioceptive and tactile feedback. A comparison study with the baseline method shows the value of contact information, which enabled the demonstrated motor skills to be extracted and replicated successfully. Extensive experiments on real dual-arm robots demonstrate the robustness and effectiveness of the fine pinch grasp policy learned directly from a one-shot demonstration: grasping the same object from different initial poses, generalizing to ten unseen objects, maintaining a firm grasp against external pushes, and contact-aware, reactive re-grasping when the object is dropped under very large perturbations. Moreover, a saliency map method is employed to describe the weight distribution across the sensing modalities during pinch grasping. A video is available online at: \href{https://youtu.be/4Pg29bUBKqs}{https://youtu.be/4Pg29bUBKqs}.