Realistic traffic simulation is critical for the development of autonomous driving systems and urban mobility planning, yet existing imitation learning approaches often fail to model realistic traffic behaviors. Behavior cloning suffers from covariate shift, while Generative Adversarial Imitation Learning (GAIL) is notoriously unstable in multi-agent settings. We identify a key source of this instability: irrelevant interaction misguidance, where a discriminator penalizes an ego vehicle's realistic behavior due to unrealistic interactions among its neighbors. To address this, we propose Decomposed Multi-agent GAIL (DecompGAIL), which explicitly decomposes realism into ego-map and ego-neighbor components, filtering out misleading neighbor-neighbor and neighbor-map interactions. We further introduce a social PPO objective that augments ego rewards with distance-weighted neighborhood rewards, encouraging overall realism across agents. Integrated into a lightweight SMART-based backbone, DecompGAIL achieves state-of-the-art performance on the WOMD Sim Agents 2025 benchmark.
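To make the idea of distance-weighted neighborhood rewards concrete, the following is a minimal sketch under stated assumptions, not the formulation used by DecompGAIL. The abstract does not specify the weighting kernel, the neighborhood radius, or how the neighborhood term is scaled, so the exponential weights `exp(-d / sigma)`, the cutoff `radius`, the mixing coefficient `beta`, and the function name `social_rewards` are all hypothetical choices made for illustration. The per-agent `ego_rewards` are assumed to come from some upstream reward model, such as a discriminator that scores ego-map and ego-neighbor realism.

```python
import math


def social_rewards(positions, ego_rewards, sigma=10.0, radius=30.0, beta=0.5):
    """Augment each agent's own reward with a distance-weighted neighbor average.

    Hypothetical formulation (assumed for illustration, not taken from the paper):
        r_social[i] = r[i] + beta * sum_j(w_ij * r[j]) / sum_j(w_ij)
        w_ij = exp(-d_ij / sigma) for every neighbor j with d_ij <= radius
    """
    n = len(positions)
    augmented = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(n):
            if i == j:
                continue
            d = math.dist(positions[i], positions[j])
            if d <= radius:
                w = math.exp(-d / sigma)  # closer neighbors receive larger weights
                weighted_sum += w * ego_rewards[j]
                weight_total += w
        # An agent with no neighbor inside the radius keeps its own reward unchanged.
        neighborhood = weighted_sum / weight_total if weight_total > 0 else 0.0
        augmented.append(ego_rewards[i] + beta * neighborhood)
    return augmented


# Three agents: two close together and one far away from both.
positions = [(0.0, 0.0), (3.0, 4.0), (100.0, 0.0)]
ego_rewards = [1.0, 0.5, -2.0]
result = social_rewards(positions, ego_rewards)
```

In this toy setting the isolated third agent is unaffected, while the two nearby agents each receive a share of the other's reward, so an agent is encouraged to act in ways that also keep its neighbors realistic. The augmented values would then stand in for the per-agent rewards in a standard PPO advantage computation.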