In this paper, we investigate how the unpredictability of surrounding cars affects an ego-car performing a driving maneuver. We use Maximum Entropy Inverse Reinforcement Learning to model reward functions for an ego-car conducting a lane change in a highway setting. We define a new feature based on the unpredictability of surrounding cars and use it in the reward function. We learn two reward functions from human data, a baseline and one that incorporates our unpredictability feature, and compare their performance in quantitative and qualitative evaluations. Our evaluation demonstrates that incorporating the unpredictability feature leads to a better fit to human-generated test data. These results encourage further investigation of the effect of unpredictability on driving behavior.