Of the many deep reinforcement learning approaches to autonomous driving, only a few make use of the options (or skills) framework. That is surprising, as this framework is naturally suited to hierarchical control applications in general, and to autonomous driving tasks in particular. This work therefore applies the options framework to autonomous driving on highways and tailors it to that setting. More specifically, we define dedicated options for longitudinal and lateral manoeuvres with embedded safety and comfort constraints. In this way, prior domain knowledge can be incorporated into the learning process and the learned driving behaviour can be constrained more easily. We propose several setups for hierarchical control with options and derive practical algorithms following state-of-the-art reinforcement learning techniques. By selecting actions for longitudinal and lateral control separately, the introduced policies over combined and hybrid options obtain the same expressiveness and flexibility that human drivers have, while being easier to interpret than classical policies over continuous actions. Of all the investigated approaches, these flexible policies over hybrid options perform best under varying traffic conditions, outperforming the baseline policies over actions.
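To make the idea of separately selected longitudinal and lateral options more concrete, the following is a minimal illustrative sketch, not the algorithm derived in this work. It assumes that each option carries an availability check standing in for an embedded safety constraint, and that a higher-level policy picks one longitudinal and one lateral option from those currently available. All names (`Option`, `MIN_GAP`, `select_option_pair`, the state keys, and the score dictionaries) are hypothetical, the scores stand in for learned values, and option termination conditions are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical state representation: a flat dict of gap features in metres.
State = Dict[str, float]


@dataclass(frozen=True)
class Option:
    name: str
    # Availability check: stands in for an embedded safety constraint.
    is_available: Callable[[State], bool]
    # Intra-option policy: maps the state to a low-level command.
    act: Callable[[State], float]


MIN_GAP = 10.0  # assumed minimum safe gap in metres (illustrative value)

LONGITUDINAL: List[Option] = [
    Option("accelerate", lambda s: s["front_gap"] > 2 * MIN_GAP, lambda s: 1.0),
    Option("keep_speed", lambda s: s["front_gap"] > MIN_GAP, lambda s: 0.0),
    Option("brake", lambda s: True, lambda s: -2.0),  # always allowed
]

LATERAL: List[Option] = [
    Option("keep_lane", lambda s: True, lambda s: 0.0),  # always allowed
    Option("change_left", lambda s: s["left_gap"] > MIN_GAP, lambda s: 1.0),
    Option("change_right", lambda s: s["right_gap"] > MIN_GAP, lambda s: -1.0),
]


def select(options: List[Option], scores: Dict[str, float], state: State) -> Option:
    """Pick the highest-scoring option among those the constraints allow."""
    available = [o for o in options if o.is_available(state)]
    return max(available, key=lambda o: scores[o.name])


def select_option_pair(
    state: State, lon_scores: Dict[str, float], lat_scores: Dict[str, float]
) -> Tuple[str, str, Tuple[float, float]]:
    """Choose longitudinal and lateral options independently of each other."""
    lon = select(LONGITUDINAL, lon_scores, state)
    lat = select(LATERAL, lat_scores, state)
    return lon.name, lat.name, (lon.act(state), lat.act(state))


state = {"front_gap": 15.0, "left_gap": 5.0, "right_gap": 30.0}
lon_scores = {"accelerate": 0.9, "keep_speed": 0.5, "brake": 0.1}
lat_scores = {"change_left": 0.8, "change_right": 0.6, "keep_lane": 0.3}
result = select_option_pair(state, lon_scores, lat_scores)
```

In this example the highest-scoring choices (`accelerate` and `change_left`) are both masked out by the gap checks, so the selection falls back to `keep_speed` and `change_right`. Masking unavailable options before selection is one simple way to encode prior domain knowledge; it is used here only as an assumption for illustration.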