Hierarchical reinforcement learning (HRL) is hypothesized to exploit the hierarchy inherent in robot learning tasks with sparse rewards, in contrast to more traditional reinforcement learning algorithms. In this work, we evaluate hierarchical reinforcement learning against standard reinforcement learning on complex navigation tasks. We examine characteristics unique to HRL, including its ability to create sub-goals and its termination function. We designed experiments to compare PPO with HRL, to compare manual and automatic sub-goal creation, and to measure how termination frequency affects performance. These experiments highlight the advantages of HRL and the mechanisms by which it achieves them.