Scalable multi-agent driving simulation requires behavior models that are both realistic and computationally efficient. We address this by optimizing the behavior model that controls individual traffic participants. To improve efficiency, we adopt an instance-centric scene representation, where each traffic participant and map element is modeled in its own local coordinate frame. This design enables efficient, viewpoint-invariant scene encoding and allows static map tokens to be reused across simulation steps. To model interactions, we employ a query-centric symmetric context encoder with relative positional encodings between local frames. We use Adversarial Inverse Reinforcement Learning to learn the behavior model and propose an adaptive reward transformation that automatically balances robustness and realism during training. Experiments demonstrate that our approach scales efficiently with the number of tokens, significantly reducing training and inference times, while outperforming several agent-centric baselines in terms of positional accuracy and robustness.
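The viewpoint-invariant encoding mentioned above can be illustrated with a minimal sketch of a relative positional encoding between two local frames. The function name and feature layout below are hypothetical, not taken from the paper: each token carries a global pose (x, y, heading), and the pairwise encoding rotates the displacement into the query token's frame, so it is unchanged under any rigid transform of the whole scene.

```python
import numpy as np

def relative_pose_encoding(pose_i, pose_j):
    """Hypothetical sketch: encode the pose of frame j relative to frame i.

    Each pose is (x, y, heading) in global coordinates. Because the result
    depends only on relative quantities, it is invariant to a global
    rotation/translation of the scene -- the property that lets static map
    tokens be encoded once and reused across simulation steps.
    """
    xi, yi, hi = pose_i
    xj, yj, hj = pose_j
    dx, dy = xj - xi, yj - yi
    # Rotate the global displacement into frame i's local coordinates.
    c, s = np.cos(-hi), np.sin(-hi)
    local_dx = c * dx - s * dy
    local_dy = s * dx + c * dy
    # Relative heading, wrapped to (-pi, pi], expressed as sin/cos features.
    dh = (hj - hi + np.pi) % (2 * np.pi) - np.pi
    return np.array([local_dx, local_dy, np.sin(dh), np.cos(dh)])
```

In a query-centric symmetric encoder, such pairwise features would be injected into the attention between token `i` and token `j`; the sketch only shows the geometric part of that computation.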