In robotic manipulation, deep imitation learning is a promising approach for acquiring manipulation skills, and learning from diverse robot datasets is considered a viable route to versatility and adaptability. In such work, robots trained on a variety of tasks have generalized across multiple objects. However, existing multi-task robot datasets have mainly focused on relatively imprecise single-arm tasks and do not address the fine-grained object manipulation that robots are expected to perform in the real world. This paper introduces a dataset of diverse object manipulations that includes dual-arm tasks and/or tasks requiring fine manipulation. The dataset contains 224k episodes (150 hours, 1,104 language instructions), including dual-arm fine tasks such as bowl-moving, pencil-case opening, or banana-peeling, and is publicly available. To support robust and precise object manipulation, the dataset also includes visual attention signals, language instructions, and dual-action labels, which separate actions into a robust reaching trajectory and a precise interaction with objects. We applied the dataset to our Dual-Action and Attention (DAA) model, which is designed for fine-grained dual-arm manipulation tasks and is robust against covariate shift. The model was tested in over 7k total trials on real-robot manipulation tasks, demonstrating its capability in fine manipulation.