The deployment of autonomous vehicles (AVs) has been hindered by rare but critical corner cases in the long tail of driving scenarios, which degrade overall AV performance. To address this challenge, adversarial generation methods have emerged as an efficient way to synthesize safety-critical scenarios for AV testing. However, the generated scenarios are seldom used for AV training, so the potential for continual improvement of the AV policy remains untapped, and the closed-loop design needed to realize it is missing. We therefore tailor the Stackelberg Driver Model (SDM) to characterize the hierarchical nature of vehicle interactions, enabling iterative improvement by engaging the background vehicles (BVs) and the AV in a sequential, game-like interaction. With the AV acting as the leader and the BVs as followers, this leader-follower formulation lets the AV keep refining its policy while always accounting for the fact that the BVs play their best response to challenge it. Extensive experiments show that our algorithm outperforms several baselines, especially in higher-dimensional scenarios, yielding substantial gains in AV capability while continually generating progressively more challenging scenarios.
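To make the leader-follower structure described above concrete, the following is a minimal sketch of a one-shot Stackelberg game over a small payoff matrix. It is an illustration only and not the SDM algorithm itself: the discrete action sets, the payoff values, the zero-sum assumption for the BV, and the function names `follower_best_response` and `stackelberg_leader_action` are all hypothetical.

```python
import numpy as np


def follower_best_response(bv_payoff: np.ndarray, av_action: int) -> int:
    """Return the BV (follower) action that maximizes the BV payoff for a given AV action."""
    return int(np.argmax(bv_payoff[av_action]))


def stackelberg_leader_action(av_payoff: np.ndarray, bv_payoff: np.ndarray):
    """Pick the AV (leader) action that maximizes the AV payoff,
    anticipating that the BV will play its best response."""
    best_action, best_value = -1, -np.inf
    for a in range(av_payoff.shape[0]):
        b = follower_best_response(bv_payoff, a)
        if av_payoff[a, b] > best_value:
            best_action, best_value = a, av_payoff[a, b]
    best_reply = follower_best_response(bv_payoff, best_action)
    return best_action, best_reply, float(best_value)


# Hypothetical payoffs: rows index AV actions, columns index BV actions.
av_payoff = np.array([[3.0, 0.0],
                      [2.0, 1.0]])
# Assumption: a fully adversarial BV, modeled here as zero-sum.
bv_payoff = -av_payoff

av_action, bv_action, value = stackelberg_leader_action(av_payoff, bv_payoff)
print(av_action, bv_action, value)
```

In this toy instance the AV avoids action 0, whose best case (3.0) is never realized because the BV would respond with the action that drives the AV payoff to 0.0. It instead commits to action 1, which guarantees 1.0 under the BV's best response.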