In recent years, Deep Reinforcement Learning (DRL) has emerged as a promising method for robot collision avoidance. However, DRL models often have limitations, such as difficulty adapting to structured environments that contain diverse pedestrians. To overcome this difficulty, previous research has tried several approaches, including training an end-to-end solution by integrating a waypoint planner with DRL and developing a multimodal solution to mitigate the drawbacks of the DRL model. These approaches have encountered issues of their own, including slow training, limited scalability, and poor coordination among the different models. To address these challenges, this paper introduces a novel approach called evolutionary curriculum training. Its primary goal is to evaluate the collision avoidance model's competency in various scenarios and to create curricula that strengthen the skills found to be insufficient. The paper introduces an evaluation technique that assesses the DRL model's performance in navigating structured maps and avoiding dynamic obstacles. In addition, an evolutionary training environment generates the curricula that target the skills the preceding evaluation found inadequate. We benchmark the performance of our model across five structured environments to validate the hypothesis that this evolutionary training environment leads to a higher success rate and a lower average number of collisions. Further details and results are available on our project website.
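The evaluate-then-target loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not model the evolutionary generation of environments; it only shows one plausible way to turn per-scenario evaluation results into a curriculum that favours weak skills. The `ScenarioResult` structure, the `build_curriculum` function, the 0.8 success threshold, and the scenario names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """Evaluation outcome for one scenario type (hypothetical structure)."""
    name: str
    episodes: int
    successes: int

    @property
    def success_rate(self) -> float:
        return self.successes / self.episodes if self.episodes else 0.0


def build_curriculum(results, threshold=0.8, total_episodes=100):
    """Allocate training episodes to scenarios that fall below `threshold`.

    Episodes are split in proportion to each weak scenario's shortfall
    (threshold - success_rate). Scenarios at or above the threshold get
    no episodes. Returns a dict {scenario name: episode count}.
    """
    shortfalls = {
        r.name: threshold - r.success_rate
        for r in results
        if r.success_rate < threshold
    }
    total = sum(shortfalls.values())
    if total == 0:
        return {}  # no weak skills were found
    return {
        name: round(total_episodes * gap / total)
        for name, gap in shortfalls.items()
    }


# Hypothetical evaluation results (assumed numbers, not from the paper).
results = [
    ScenarioResult("narrow_corridor", episodes=50, successes=20),
    ScenarioResult("crossing_pedestrians", episodes=50, successes=30),
    ScenarioResult("open_space", episodes=50, successes=48),
]
curriculum = build_curriculum(results)
print(curriculum)
```

In this sketch the scenario with the lowest success rate receives the largest share of training episodes, and a scenario the model already handles well receives none. A fuller version would re-evaluate after each training round and regenerate the curriculum.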