Reinforcement Learning (RL) has made promising progress in planning and decision-making for Autonomous Vehicles (AVs) in simple driving scenarios. However, existing RL algorithms for AVs fail to learn critical driving skills in complex urban scenarios. First, urban driving requires AVs to handle multiple driving tasks, which conventional RL algorithms cannot do. Second, the presence of other vehicles makes the environment change dynamically, which makes it difficult for RL algorithms to plan the action and trajectory of the AV. In this work, we propose an action and trajectory planner based on Hierarchical Reinforcement Learning (atHRL), which models the agent's behavior hierarchically using perception from LiDAR and a bird's-eye view. The proposed atHRL method learns to make decisions about the agent's future trajectory and computes target waypoints in a continuous setting using a hierarchical Deep Deterministic Policy Gradient (DDPG) algorithm. The waypoints planned by the atHRL model are then sent to a low-level controller, which generates the steering and throttle commands required for the vehicle maneuver. We empirically verify the efficacy of atHRL through extensive experiments in the CARLA simulator, in complex urban driving scenarios that comprise multiple tasks in the presence of other vehicles. The experimental results suggest a significant performance improvement over state-of-the-art RL methods.
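The last stage of the pipeline, in which a planned waypoint is handed to a low-level controller that outputs steering and throttle, can be illustrated with a minimal sketch. The abstract does not specify the controller, so everything here is an assumption made for illustration: the pure-pursuit steering rule, the proportional speed controller, the names `VehicleState` and `waypoint_to_control`, the wheelbase, gain, and steering limit, and the right-handed frame with counter-clockwise-positive yaw. The output ranges (steer in [-1, 1], throttle in [0, 1]) are chosen to resemble the normalized controls used by simulators such as CARLA, whose axis and sign conventions would need to be checked separately.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleState:
    """Hypothetical ego-vehicle state (not taken from the paper)."""
    x: float      # position in metres, world frame
    y: float
    yaw: float    # heading in radians, counter-clockwise positive (assumed)
    speed: float  # forward speed in m/s


def waypoint_to_control(state, waypoint, target_speed,
                        wheelbase=2.8, speed_gain=0.5,
                        max_steer=math.radians(35.0)):
    """Map one target waypoint (x, y) to a normalized (steer, throttle) pair.

    Steering uses a pure-pursuit rule and throttle uses a proportional
    speed controller. Both are stand-ins, not the paper's controller.
    """
    dx = waypoint[0] - state.x
    dy = waypoint[1] - state.y
    lookahead = math.hypot(dx, dy)

    if lookahead < 1e-6:
        # Waypoint coincides with the vehicle: keep the wheels straight.
        steer_angle = 0.0
    else:
        # Bearing of the waypoint relative to the current heading.
        alpha = math.atan2(dy, dx) - state.yaw
        # Pure pursuit: delta = atan(2 * L * sin(alpha) / lookahead).
        steer_angle = math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)

    # Normalize and clamp to the assumed actuator ranges.
    steer = max(-1.0, min(1.0, steer_angle / max_steer))
    throttle = max(0.0, min(1.0, speed_gain * (target_speed - state.speed)))
    return steer, throttle


if __name__ == "__main__":
    ego = VehicleState(x=0.0, y=0.0, yaw=0.0, speed=3.0)
    # A waypoint ahead and to the left, as a high-level planner might emit.
    print(waypoint_to_control(ego, (10.0, 5.0), target_speed=8.0))
```

In this sketch the learned planner would only need to supply `waypoint` and `target_speed`; the separation keeps the RL action space at the trajectory level while the controller handles actuation.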