In robotic manipulation, deep imitation learning is recognized as a promising approach for acquiring manipulation skills, and learning from diverse robot datasets is considered a viable way to achieve versatility and adaptability. By learning a variety of tasks, robots in such research have achieved generality across multiple objects. However, existing multi-task robot datasets have mainly focused on relatively coarse single-arm tasks and do not address the fine-grained object manipulation that robots are expected to perform in the real world. This paper introduces a dataset of diverse object manipulations that includes dual-arm tasks and/or tasks requiring fine manipulation. To this end, we have collected a publicly available dataset of 224k episodes (150 hours, 1,104 language instructions), which includes fine dual-arm tasks such as bowl-moving, pencil-case opening, and banana-peeling. To support robust and precise object manipulation, the dataset also includes visual attention signals, language instructions, and dual-action labels, a signal that separates actions into a robust reaching trajectory and a precise interaction with objects. We applied the dataset to our Dual-Action and Attention (DAA) model, which is designed for fine-grained dual-arm manipulation tasks and is robust against covariate shift. The model was tested in over 7k total trials of real-robot manipulation tasks, demonstrating its capability in fine manipulation. The dataset is available at https://sites.google.com/view/multi-task-fine.