The high cost of real-world data for robotics Reinforcement Learning (RL) has led to the widespread use of simulators. Despite extensive work on building better dynamics models so that simulators match the real world, another, often-overlooked mismatch between simulation and the real world remains: the distribution of available training tasks. This mismatch is further exacerbated by existing curriculum learning (CL) techniques, which automatically vary the simulated task distribution without considering its relevance to the real world. To address these challenges, we posit that curriculum learning for robotics RL needs to be grounded in real-world task distributions. To this end, we propose Grounded Curriculum Learning (GCL), which aligns the simulated task distribution in the curriculum with the real world and explicitly considers which tasks have been given to the robot and how the robot has performed in the past. We validate GCL using the BARN dataset on complex navigation tasks, achieving 6.8% and 6.5% higher success rates than a state-of-the-art CL method and a curriculum designed by human experts, respectively. These results show that GCL can enhance learning efficiency and navigation performance by grounding the simulation task distribution in the real world within an adaptive curriculum.