In robotic manipulation, deep imitation learning is recognized as a promising approach for acquiring manipulation skills, and learning from diverse robot datasets is considered a viable route to versatility and adaptability. By learning a variety of tasks, robots in such research have achieved generality across multiple objects. However, these multi-task robot datasets have mainly focused on relatively imprecise single-arm tasks and do not address the fine-grained object manipulation that robots are expected to perform in the real world. This paper introduces a dataset of diverse object manipulations that includes dual-arm tasks and/or tasks requiring fine manipulation. The dataset contains 224k episodes (150 hours, 1,104 language instructions), including dual-arm fine tasks such as bowl-moving, pencil-case opening, or banana-peeling, and is publicly available. To support robust and precise object manipulation, it also provides visual attention signals, language instructions, and dual-action labels, a signal that separates actions into a robust reaching trajectory and a precise interaction with objects. We applied the dataset to our Dual-Action and Attention (DAA) model, which is designed for fine-grained dual-arm manipulation tasks and is robust against covariate shift. The model was tested in over 7k total trials on real robot manipulation tasks, demonstrating its capability in fine manipulation.