We develop a deep reinforcement learning framework for tactical decision making in an autonomous truck, specifically for Adaptive Cruise Control (ACC) and lane change maneuvers in a highway scenario. Our results demonstrate that it is beneficial to separate high-level decision making from low-level control: the reinforcement learning agent makes the high-level decisions, while controllers based on physical models execute the low-level control actions. We then study how to optimize performance under a realistic, multi-objective reward function based on the Total Cost of Operation (TCOP) of the truck, using three approaches: adding weights to the reward components, normalizing the reward components, and using curriculum learning techniques.
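As a rough illustration of the difference between weighting and normalizing reward components, the following minimal sketch may help. It is not the reward function used in this work: the component names (`energy`, `driver`, `penalty`), the weights, and the reference scales are all hypothetical placeholders chosen only to show the two constructions side by side.

```python
from dataclasses import dataclass


@dataclass
class CostComponents:
    """Per-step cost terms, each in its own unit (all names are hypothetical)."""
    energy: float   # assumed: energy cost incurred during the step
    driver: float   # assumed: driver time cost incurred during the step
    penalty: float  # assumed: penalty for an unsafe event, 0.0 if none


def weighted_reward(c: CostComponents, weights: dict) -> float:
    """Weighted variant: negative weighted sum of the raw cost terms."""
    return -(weights["energy"] * c.energy
             + weights["driver"] * c.driver
             + weights["penalty"] * c.penalty)


def normalized_reward(c: CostComponents, scales: dict) -> float:
    """Normalized variant: divide each term by a reference scale first,
    so that every component contributes on a comparable range."""
    return -(c.energy / scales["energy"]
             + c.driver / scales["driver"]
             + c.penalty / scales["penalty"])
```

In the weighted form the relative importance of each cost is tuned by hand, whereas in the normalized form it follows from the chosen reference scales. Which one trains better is an empirical question.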