Generating realistic and controllable agent behaviors in traffic simulation is crucial for the development of autonomous vehicles. This problem is often formulated as imitation learning (IL) from real-world driving data, either by directly predicting future trajectories or by inferring cost functions with inverse optimal control. In this paper, we draw a conceptual connection between IL and diffusion-based generative modeling and introduce a novel framework, Versatile Behavior Diffusion (VBD), to simulate interactive scenarios with multiple traffic participants. Our model not only generates scene-consistent multi-agent interactions but also enables scenario editing through multi-step guidance and refinement. Experimental evaluations show that VBD achieves state-of-the-art performance on the Waymo Sim Agents benchmark. In addition, we illustrate the versatility of our model by adapting it to various applications: VBD can produce scenarios conditioned on priors, integrate with model-based optimization, sample multi-modal scene-consistent scenarios by fusing marginal predictions, and generate safety-critical scenarios when combined with a game-theoretic solver.