We propose WayEx, a waypoint exploration strategy for learning complex goal-conditioned robotics tasks from a single demonstration. Our approach differs from existing imitation learning methods in two ways: it requires fewer expert examples, and it does not need information about the actions taken during the demonstration. This is accomplished by introducing a new reward function and employing a knowledge expansion technique. We demonstrate the effectiveness of WayEx across six diverse tasks, showcasing its applicability in various environments. Notably, our method reduces training time by 50% compared to traditional reinforcement learning methods. Given only a single demonstration, WayEx also obtains a higher reward than existing imitation learning methods. Furthermore, we demonstrate its success in complex environments where standard approaches fall short. More information is available at: https://waypoint-ex.github.io.