Designing a safe and human-like decision-making system for an autonomous vehicle is a challenging task. Generative imitation learning is one possible approach for automating policy-building by leveraging both real-world and simulated decisions. Previous work that applies generative imitation learning to autonomous driving policies focuses on learning a low-level controller for simple settings. However, to scale to complex settings, many autonomous driving systems combine fixed, safe, optimization-based low-level controllers with high-level decision-making logic that selects the appropriate task and associated controller. In this paper, we attempt to bridge this gap in complexity by employing Safety-Aware Hierarchical Adversarial Imitation Learning (SHAIL), a method for learning a high-level policy that selects from a set of low-level controller instances in a way that imitates low-level driving data on-policy. We introduce an urban roundabout simulator that controls non-ego vehicles using real data from the Interaction dataset. We then demonstrate empirically that even with simple controller options, our approach can produce better behavior than previous approaches in driver imitation that have difficulty scaling to complex environments. Our implementation is available at https://github.com/sisl/InteractionImitation.