Existing evaluation paradigms for Autonomous Vehicles (AVs) face critical limitations. Real-world evaluation is often challenging due to safety concerns and a lack of reproducibility, whereas closed-loop simulation can face insufficient realism or high computational costs. Open-loop evaluation, while being efficient and data-driven, relies on metrics that generally overlook compounding errors. In this paper, we propose pseudo-simulation, a novel paradigm that addresses these limitations. Pseudo-simulation operates on real datasets, similar to open-loop evaluation, but augments them with synthetic observations generated prior to evaluation using 3D Gaussian Splatting. Our key idea is to approximate potential future states the AV might encounter by generating a diverse set of observations that vary in position, heading, and speed. Our method then assigns a higher importance to synthetic observations that best match the AV's likely behavior using a novel proximity-based weighting scheme. This enables evaluating error recovery and the mitigation of causal confusion, as in closed-loop benchmarks, without requiring sequential interactive simulation. We show that pseudo-simulation is better correlated with closed-loop simulations ($R^2=0.8$) than the best existing open-loop approach ($R^2=0.7$). We also establish a public leaderboard for the community to benchmark new methodologies with pseudo-simulation. Our code is available at https://github.com/autonomousvision/navsim.
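The abstract does not specify the exact form of the proximity-based weighting scheme, so the following is only a minimal sketch of one way such a scheme could work. It assumes a Gaussian kernel over the planar distance between the position the AV's planner is expected to reach and the start position of each pre-generated synthetic observation. The function names `proximity_weights` and `aggregate_score`, the bandwidth parameter `sigma`, and all numeric values are hypothetical and are not taken from the NAVSIM code base.

```python
import math


def proximity_weights(predicted_xy, synthetic_xys, sigma=1.0):
    """Return normalized weights, one per synthetic observation.

    ASSUMPTION: a Gaussian kernel on Euclidean distance; the paper's
    actual weighting function may differ.
    """
    if not synthetic_xys:
        raise ValueError("at least one synthetic observation is required")
    raw = []
    for x, y in synthetic_xys:
        # Squared distance between the expected AV position and the
        # start position of this synthetic observation.
        d2 = (x - predicted_xy[0]) ** 2 + (y - predicted_xy[1]) ** 2
        raw.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(raw)
    if total == 0.0:
        # All kernels underflowed to zero: fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]


def aggregate_score(scores, weights):
    """Weighted average of per-observation scores (weights sum to 1)."""
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical example: expected AV position and three synthetic start positions.
predicted = (10.0, 0.5)
synthetic = [(10.0, 0.0), (10.0, 2.0), (8.0, -1.5)]
per_observation_scores = [0.9, 0.4, 0.7]  # made-up scores in [0, 1]

weights = proximity_weights(predicted, synthetic, sigma=1.0)
final_score = aggregate_score(per_observation_scores, weights)
```

In this sketch the synthetic observation closest to the expected AV position dominates the aggregate, while distant observations contribute little. A smaller `sigma` would concentrate the weight further on the nearest observation, and a larger one would approach a uniform average.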