Learning from human demonstrations (behavior cloning) is a cornerstone of robot learning. However, most behavior cloning algorithms require a large number of demonstrations to learn a task, especially general tasks with a wide variety of initial conditions. Humans, by contrast, can learn to complete even complex tasks after seeing only one or two demonstrations. Our work seeks to emulate this ability, using behavior cloning to learn a task from a single human demonstration. We achieve this by using linear transforms to augment the single demonstration, generating a set of trajectories that covers a wide range of initial conditions. With these demonstrations, we train a behavior cloning agent that successfully completes three block manipulation tasks. Additionally, we develop a novel addition to the temporal ensembling method used by action chunking agents during inference. By incorporating the standard deviation of the action predictions into the ensembling method, our approach is more robust to unforeseen changes in the environment, resulting in significant performance improvements.
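To make the two ideas in the abstract concrete, the following is a minimal Python sketch and not the implementation used in this work. It assumes a demonstration stored as an array of end-effector positions, a planar similarity transform (rotation about the vertical axis, uniform scaling, translation) as the linear transform, and a simple threshold rule for using the standard deviation during ensembling. The function names `augment_trajectory` and `ensemble_action` and the parameters `m` and `std_threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def augment_trajectory(demo_xyz, new_start_xy, new_goal_xy):
    """Map one demonstrated path onto a new start/goal pair.

    Assumption: the linear transform is a planar similarity transform
    (rotate, scale, translate in x-y); heights (z) are left unchanged.
    The demonstration must not start and end at the same x-y point.
    """
    demo_xyz = np.asarray(demo_xyz, dtype=float)
    new_start_xy = np.asarray(new_start_xy, dtype=float)
    new_goal_xy = np.asarray(new_goal_xy, dtype=float)

    old_start = demo_xyz[0, :2]
    old_vec = demo_xyz[-1, :2] - old_start
    new_vec = new_goal_xy - new_start_xy

    # Scale and rotation that carry the old start-to-goal vector onto the new one.
    scale = np.linalg.norm(new_vec) / np.linalg.norm(old_vec)
    angle = np.arctan2(new_vec[1], new_vec[0]) - np.arctan2(old_vec[1], old_vec[0])
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])

    out = demo_xyz.copy()
    # Points are row vectors, so rotating by `rot` is a right-multiply by rot.T.
    out[:, :2] = scale * (demo_xyz[:, :2] - old_start) @ rot.T + new_start_xy
    return out


def ensemble_action(preds, m=0.01, std_threshold=0.05):
    """Combine overlapping chunk predictions for the current timestep.

    `preds` has shape (k, action_dim), ordered oldest to newest.
    Assumption: if the predictions disagree strongly (large standard
    deviation), the scene has probably changed, so only the newest
    prediction is trusted. Otherwise an exponentially weighted average
    is used, with the largest weight on the oldest prediction.
    """
    preds = np.asarray(preds, dtype=float)
    if preds.std(axis=0).max() > std_threshold:
        return preds[-1]
    weights = np.exp(-m * np.arange(len(preds)))
    return (weights[:, None] * preds).sum(axis=0) / weights.sum()
```

Calling `augment_trajectory` repeatedly with sampled start and goal positions would produce a set of synthetic trajectories from the one recording. The threshold rule in `ensemble_action` is only one plausible way to fold the standard deviation into the ensemble; a continuous down-weighting of older predictions would be another.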