Algorithms that learn multi-agent behavioral models from human demonstrations have led to increasingly realistic simulations in autonomous driving. In general, such models learn to jointly predict trajectories for all controlled agents by exploiting road context information, such as drivable lanes, obtained from manually annotated high-definition (HD) maps. Recent studies show that these models benefit greatly from increasing the amount of human data available for training. However, HD maps must be manually annotated for every new location, which creates a bottleneck for efficiently scaling up human traffic datasets. We propose an aerial image-based map (AIM) representation that requires minimal annotation and provides rich road context information for traffic agents such as pedestrians and vehicles. We evaluate multi-agent trajectory prediction with the AIM by incorporating it into a differentiable driving simulator as an image-texture-based differentiable rendering module. Our results show that models using the AIM representation achieve multi-agent trajectory prediction performance competitive with models trained on rasterized HD maps, especially for pedestrians in the scene.