Imitation learning has gained immense popularity because of its high sample-efficiency. However, in real-world scenarios, where the trajectory distribution of most of the tasks dynamically shifts, model fitting on continuously aggregated data alone would be futile. In some cases, the distribution shifts, so much, that it is difficult for an agent to infer the new task. We propose a novel algorithm to generalize on any related task by leveraging prior knowledge on a set of specific tasks, which involves assigning importance weights to each past demonstration. We show experiments where the robot is trained from a diversity of environmental tasks and is also able to adapt to an unseen environment, using few-shot learning. We also developed a prototype robot system to test our approach on the task of visual navigation, and experimental results obtained were able to confirm these suppositions.
翻译:模仿学习因其高样本效率而广受欢迎。然而,在现实场景中,大多数任务的轨迹分布会动态变化,仅对连续聚合数据进行模型拟合将徒劳无功。在某些情况下,分布漂移如此严重,以至于智能体难以推断新任务。我们提出了一种新颖算法,通过利用特定任务集合的先验知识来泛化到任何相关任务,该算法为每个过往示范分配重要性权重。我们展示了实验:机器人通过多样化的环境任务进行训练,并能通过少样本学习适应未见过的环境。我们还开发了一个原型机器人系统,在视觉导航任务上测试了我们的方法,获得的实验结果证实了这些假设。