Reinforcement Learning (RL) is a highly active research field with promising advancements. In autonomous driving, however, often only very simple scenarios are examined. Common approaches use non-interpretable control commands as the action space and reward designs that lack structure. In this work, we introduce Informed Reinforcement Learning, where a structured rulebook is integrated as a knowledge source. We learn trajectories and assess them with a situation-aware reward design, yielding a dynamic reward that allows the agent to learn situations which require controlled traffic rule exceptions. Our method is applicable to arbitrary RL models. We successfully demonstrate high completion rates of complex scenarios with recent model-based agents.
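To make the idea of a situation-aware reward with controlled rule exceptions more concrete, the following is a minimal sketch only. The abstract does not specify the rulebook's contents or the reward formula, so everything here is an assumption for illustration: the `Rule` class, the `situation_aware_reward` function, the rule names, the penalty values, and the boolean state keys are all hypothetical. The sketch shows one plausible reading: each rule carries a penalty, and a rule may declare a situation in which violating it is permitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical state representation: named boolean observations.
State = Dict[str, bool]


def _never(state: State) -> bool:
    """Default exemption: the rule has no permitted exception."""
    return False


@dataclass(frozen=True)
class Rule:
    """One rulebook entry (illustrative, not the paper's definition)."""
    name: str
    penalty: float
    violated: Callable[[State], bool]
    exempt: Callable[[State], bool] = _never


def situation_aware_reward(state: State, rulebook: List[Rule], progress: float) -> float:
    """Start from a progress reward and subtract the penalty of every
    rule that is violated without a situation that exempts it."""
    reward = progress
    for rule in rulebook:
        if rule.violated(state) and not rule.exempt(state):
            reward -= rule.penalty
    return reward


# Assumed example rulebook: collisions are never excused, while crossing
# a solid line is tolerated when the agent's own lane is blocked.
RULEBOOK = [
    Rule("no_collision", 10.0, lambda s: s["collision"]),
    Rule(
        "stay_in_lane",
        1.0,
        lambda s: s["crossed_solid_line"],
        exempt=lambda s: s["lane_blocked"],
    ),
]
```

In this sketch the same action (crossing a solid line) is penalized in ordinary traffic but not when the lane is blocked, which is what makes the reward dynamic with respect to the situation. A real system would derive the situation from the simulator or perception stack rather than from hand-set flags.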