Realistic and interactive traffic simulation is essential for training and evaluating autonomous driving systems. However, most existing data-driven simulation methods rely on static initialization or log-replay data, limiting their ability to model dynamic, long-horizon scenarios with evolving agent populations. We propose SceneStreamer, a unified autoregressive framework for continuous scenario generation. It represents the entire scene as a sequence of tokens (traffic light signals, agent states, and motion vectors) and generates them step by step with a transformer model. This design enables SceneStreamer to continuously introduce and retire agents over an unbounded horizon, supporting realistic long-duration simulation. Experiments demonstrate that SceneStreamer produces realistic, diverse, and adaptive traffic behaviors. Furthermore, reinforcement learning policies trained in SceneStreamer-generated scenarios achieve superior robustness and generalization, validating its utility as a high-fidelity simulation environment for autonomous driving. More information is available at https://vail-ucla.github.io/scenestreamer/.
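To make the autoregressive formulation concrete, the sketch below shows a minimal causal-transformer rollout over a flat stream of scene tokens, with a rolling context window so generation can continue over an unbounded horizon. This is an illustrative assumption of how such a pipeline could look, not the SceneStreamer implementation: the vocabulary layout, token roles, and hyperparameters are placeholders.

```python
# Minimal sketch (NOT the official SceneStreamer code): autoregressive generation of a
# scene-token stream with a causal transformer. Vocabulary size, context length, token
# roles (traffic light / agent state / motion), and model sizes are assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed size of the discretized scene-token vocabulary
CONTEXT_LEN = 512   # assumed rolling context window for unbounded-horizon rollout


class SceneTokenDecoder(nn.Module):
    """Causal transformer over a flat stream of scene tokens."""

    def __init__(self, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        self.pos = nn.Embedding(CONTEXT_LEN, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, tokens):  # tokens: (batch, seq) of integer token ids
        seq = tokens.shape[1]
        x = self.embed(tokens) + self.pos(torch.arange(seq, device=tokens.device))
        mask = nn.Transformer.generate_square_subsequent_mask(seq).to(tokens.device)
        return self.head(self.blocks(x, mask=mask))  # (batch, seq, vocab) logits


@torch.no_grad()
def roll_out(model, prompt, steps=64, temperature=1.0):
    """Sample new scene tokens step by step, keeping only the most recent context
    so the rollout can in principle continue indefinitely."""
    tokens = prompt.clone()
    for _ in range(steps):
        ctx = tokens[:, -CONTEXT_LEN:]
        logits = model(ctx)[:, -1] / temperature
        nxt = torch.multinomial(torch.softmax(logits, dim=-1), num_samples=1)
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens


if __name__ == "__main__":
    model = SceneTokenDecoder()
    # A hypothetical prompt: e.g. map, traffic-light, and initial agent tokens.
    prompt = torch.randint(0, VOCAB_SIZE, (1, 16))
    print(roll_out(model, prompt, steps=8).shape)  # torch.Size([1, 24])
```

In such a formulation, introducing or retiring an agent amounts to emitting or omitting that agent's tokens at a given step, which is what lets the rollout handle an evolving agent population without a fixed scene size.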