Traditional trajectory planning methods for autonomous vehicles have several limitations. For example, their reliance on heuristics and simple explicit rules limits generalizability and hinders complex motions. These limitations can be addressed with reinforcement learning-based trajectory planning. However, reinforcement learning suffers from unstable learning, and existing reinforcement learning-based trajectory planning methods do not account for uncertainties. Thus, this paper proposes a reinforcement learning-based trajectory planning method for autonomous vehicles. The proposed method includes an iterative reward prediction method that stabilizes the learning process and an uncertainty propagation method that makes the reinforcement learning agent aware of uncertainties. The proposed method was evaluated using the CARLA simulator. Compared to the baseline methods, the proposed method reduced the collision rate by 60.17% and increased the average reward by 30.82 times. A video of the proposed method is available at https://www.youtube.com/watch?v=PfDbaeLfcN4.