The deployment of autonomous vehicles (AVs) has been hindered by rare but critical corner cases in the long tail of the driving-scenario distribution, which degrade overall performance. To address this challenge, adversarial generation methods have emerged as an efficient class of approaches for synthesizing safety-critical scenarios for AV testing. However, the generated scenarios are seldom used for AV training, so their potential for continual AV policy improvement remains untapped, and the closed-loop design needed to realize it is missing. We therefore tailor the Stackelberg Driver Model (SDM) to accurately characterize the hierarchical nature of vehicle interaction dynamics, enabling iterative improvement by engaging background vehicles (BVs) and the AV in a sequential, game-like interaction. With the AV acting as the leader and the BVs as followers, this leader-follower model lets the AV consistently refine its policy while accounting for the fact that the BVs play a best response to challenge it. Extensive experiments show that our algorithm outperforms several baselines, especially in higher-dimensional scenarios, yielding substantial gains in AV capability while continually generating progressively more challenging scenarios. Code is available at https://github.com/BlueCat-de/SDM.
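To make the leader-follower structure concrete, the following is a minimal toy sketch of a Stackelberg solve on a small matrix game. It is not the SDM algorithm from the repository. The payoff table `AV_REWARD`, the action labels, and the zero-sum assumption (the BV simply minimizes the AV's reward) are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical payoff table (an assumption, not taken from the paper):
# rows are AV (leader) actions, columns are BV (follower) actions,
# entries are the AV's reward. The BV is assumed to be purely adversarial.
AV_ACTIONS = ["keep_lane", "brake", "change_lane"]
BV_ACTIONS = ["cut_in", "brake_hard"]
AV_REWARD: List[List[float]] = [
    [-5.0, 1.0],   # keep_lane
    [0.0, -1.0],   # brake
    [-2.0, 2.0],   # change_lane
]


def follower_best_response(av_reward: List[List[float]], av_action: int) -> int:
    """The BV observes the AV's action and picks the response that
    minimizes the AV's reward (the zero-sum assumption)."""
    row = av_reward[av_action]
    return min(range(len(row)), key=lambda j: row[j])


def stackelberg_leader_action(av_reward: List[List[float]]) -> Tuple[int, int, float]:
    """The AV anticipates the BV's best response to each of its actions
    and commits to the action with the highest resulting reward."""
    best: Tuple[int, int, float] = (-1, -1, float("-inf"))
    for a in range(len(av_reward)):
        b = follower_best_response(av_reward, a)
        value = av_reward[a][b]
        if value > best[2]:
            best = (a, b, value)
    return best


if __name__ == "__main__":
    a, b, value = stackelberg_leader_action(AV_REWARD)
    print(AV_ACTIONS[a], BV_ACTIONS[b], value)
```

In this toy table the leader avoids `keep_lane` and `change_lane`, since the follower's best response to each is more damaging than its best response to `brake`. The method described above applies the same principle to learned policies rather than to an enumerable table.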