Simulation is an essential tool for developing and benchmarking autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of nuanced and complex multi-agent interactive behaviors. To address this challenge, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simulation and testing. Waymax uses publicly released, real-world driving data (e.g., the Waymo Open Motion Dataset) to initialize or play back a diverse set of multi-agent simulated scenarios. It runs entirely on hardware accelerators such as TPUs and GPUs and supports in-graph simulation for training, making it suitable for modern large-scale, distributed machine learning workflows. To support online training and evaluation, Waymax includes several learned and hard-coded behavior models that allow for realistic interaction within simulation. Alongside Waymax, we benchmark a suite of popular imitation learning and reinforcement learning (RL) algorithms and run ablation studies on different design decisions; the results highlight the effectiveness of routes as guidance for planning agents and the ability of RL to overfit against simulated agents.
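To make the idea of in-graph simulation on accelerators concrete, the following is a minimal JAX sketch under stated assumptions; it does not use or reproduce the Waymax API. The point-mass dynamics in `step`, the controller in `policy`, its `gain` parameter, and the constants `DT` and `NUM_STEPS` are all hypothetical and chosen only for illustration. The sketch shows the general pattern: the whole rollout is compiled as one computation with `jax.lax.scan`, batched over scenarios with `jax.vmap`, and differentiated end to end with `jax.grad`.

```python
import jax
import jax.numpy as jnp

DT = 0.1        # assumed time step in seconds (illustrative)
NUM_STEPS = 80  # assumed rollout length (illustrative)


def step(state, action):
    """Toy point-mass dynamics: state is (x, y, vx, vy), action is (ax, ay)."""
    pos, vel = state[:2], state[2:]
    new_vel = vel + action * DT
    new_pos = pos + new_vel * DT
    return jnp.concatenate([new_pos, new_vel])


def policy(state, gain):
    """Hypothetical controller: accelerate toward the origin, with damping."""
    return -gain * state[:2] - 0.5 * state[2:]


@jax.jit
def rollout(initial_state, gain):
    """Runs the full closed-loop rollout inside one compiled computation."""
    def body(state, _):
        next_state = step(state, policy(state, gain))
        return next_state, next_state  # (carry, per-step output)

    _, trajectory = jax.lax.scan(body, initial_state, None, length=NUM_STEPS)
    return trajectory  # shape: (NUM_STEPS, 4)


def loss(gain, initial_states):
    """Mean squared distance from the origin at the final step."""
    trajectories = jax.vmap(rollout, in_axes=(0, None))(initial_states, gain)
    return jnp.mean(jnp.sum(trajectories[:, -1, :2] ** 2, axis=-1))


# Two toy "scenarios", simulated in parallel.
initial_states = jnp.array([[10.0, 0.0, 0.0, 0.0],
                            [0.0, 5.0, 1.0, 0.0]])
trajectories = jax.vmap(rollout, in_axes=(0, None))(initial_states, 1.0)

# Because the simulation is in-graph, gradients flow through every step.
gain_grad = jax.grad(loss)(1.0, initial_states)
```

Because the rollout never leaves the compiled graph, the same code runs unchanged on CPU, GPU, or TPU, and the gradient with respect to `gain` is obtained by differentiating through all simulated steps.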