Towards safe autonomous driving (AD), we consider the problem of learning models that accurately capture the diversity and the tail quantiles of the probability distributions of human driver behavior in interaction with an AD vehicle. Such models, which predict drivers' continuous actions from their states, are particularly relevant for closing the gap between AD agent simulations and reality. To this end, we adapt to this setting two flexible quantile learning frameworks that avoid strong distributional assumptions: (1) quantile regression (based on the tilted absolute loss), and (2) autoregressive quantile flows (a version of normalizing flows). Training is done in a behavior-cloning fashion. We use the highD dataset, which consists of driver trajectories on several highways. We evaluate our approach on a one-step acceleration prediction task and in multi-step driver simulation rollouts. We report quantitative results using the tilted absolute loss as the metric, give qualitative examples showing that realistic extremal behavior can be learned, and discuss the main insights.
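For readers unfamiliar with the tilted absolute loss (also known as the pinball loss) that underlies quantile regression and serves as the evaluation metric here, the following is a minimal sketch in plain Python. It uses the standard textbook definition and is not taken from the authors' implementation; the function name `tilted_absolute_loss`, the argument names, and the sample acceleration values are assumptions made for illustration only.

```python
from typing import Sequence


def tilted_absolute_loss(
    y_true: Sequence[float], y_pred: Sequence[float], tau: float
) -> float:
    """Mean tilted absolute (pinball) loss at quantile level tau.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for y, q in zip(y_true, y_pred):
        residual = y - q
        # Under-prediction (residual > 0) costs tau * |residual|,
        # over-prediction (residual < 0) costs (1 - tau) * |residual|.
        total += max(tau * residual, (tau - 1.0) * residual)
    return total / len(y_true)


if __name__ == "__main__":
    # Made-up accelerations in m/s^2, purely illustrative (not highD data).
    accelerations = [-3.0, -1.0, 0.0, 0.5, 1.0]
    tau = 0.1  # a low quantile level, e.g. to probe hard-braking behavior
    for candidate in (-3.0, -1.0, 0.0):
        constant_prediction = [candidate] * len(accelerations)
        loss = tilted_absolute_loss(accelerations, constant_prediction, tau)
        print(f"candidate={candidate:+.1f}  loss={loss:.3f}")
```

The asymmetric weighting is what lets a model target a specific quantile: for a low level such as `tau = 0.1`, over-predictions are penalized more heavily than under-predictions, so the minimizer is pushed towards the lower tail of the distribution.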