Data-driven simulation has become a favored way to train and test autonomous driving algorithms. The idea of replacing the actual environment with a learned simulator has also been explored in model-based reinforcement learning, in the context of world models. In this work, we show that data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles. Existing data-driven traffic simulators lack configurability and scalability. To generate configurable behaviors, we introduce for each agent a destination as navigational information and a time-invariant latent personality that specifies the behavioral style. To improve scalability, we present a new scheme of positional encoding for angles, which allows all agents to share the same vectorized context and enables an architecture based on dot-product attention. As a result, we can simulate all traffic participants seen in dense urban scenarios. Experiments on the Waymo Open Motion Dataset show that TrafficBots can simulate realistic multi-agent behaviors and achieve good performance on the motion prediction task.
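The abstract names a positional encoding for angles but does not give its form. As a minimal sketch only, one plausible construction takes the sine and cosine of integer multiples of the angle, so that the encoding is exactly periodic in 2π and headings that differ by a full turn map to the same vector. The function name `angle_encoding` and the parameter `num_freqs` are hypothetical and are not taken from TrafficBots.

```python
import numpy as np


def angle_encoding(theta, num_freqs=4):
    """Encode angles (radians) as sin/cos of integer multiples of the angle.

    Assumption: integer frequencies 1..num_freqs are used so that the
    encoding is periodic in 2*pi. This is an illustrative choice, not
    necessarily the scheme used in the paper.
    """
    theta = np.asarray(theta, dtype=np.float64)
    k = np.arange(1, num_freqs + 1)        # integer frequencies 1..num_freqs
    phase = theta[..., None] * k           # shape (..., num_freqs)
    # Concatenate sine and cosine features: shape (..., 2 * num_freqs)
    return np.concatenate([np.sin(phase), np.cos(phase)], axis=-1)


# Usage: headings of three agents, in radians
headings = np.array([0.0, np.pi / 2, -np.pi])
features = angle_encoding(headings)        # shape (3, 8)
```

Because the features are bounded and wrap around consistently, they can be fed to a dot-product attention layer alongside other vectorized inputs without a discontinuity at ±π.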