The deployment of autonomous vehicles (AVs) has been hindered by rare but critical corner cases in the long-tail distribution of driving scenarios, which degrade overall performance. To address this challenge, adversarial generation methods have emerged as an efficient class of approaches for synthesizing safety-critical scenarios for AV testing. However, the generated scenarios are often underutilized for AV training, so the potential for continual AV policy improvement remains untapped, and the closed-loop design needed to achieve it is missing. We therefore tailor the Stackelberg Driver Model (SDM) to accurately characterize the hierarchical nature of vehicle interaction dynamics, facilitating iterative improvement by engaging background vehicles (BVs) and the AV in a sequential, game-like interaction paradigm. With the AV acting as the leader and the BVs as followers, this leader-follower modeling ensures that the AV consistently refines its policy while always taking into account the additional information that the BVs play the best response to challenge it. Extensive experiments show that our algorithm outperforms several baselines, especially in higher-dimensional scenarios, leading to substantial advancements in AV capabilities while continually generating progressively more challenging scenarios. Code is available at https://github.com/BlueCat-de/SDM.
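The leader-follower structure described above can be illustrated with a minimal sketch of a pure-strategy Stackelberg solution in a small finite game. This is not the SDM algorithm from the repository: the function name `stackelberg_pure`, the 2x2 payoff tables, and the zero-sum assumption (the BV payoff is the negative of the AV payoff) are all hypothetical and chosen only to show how a leader optimizes while anticipating the follower's best response.

```python
import numpy as np


def stackelberg_pure(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg solution of a finite two-player game.

    Rows index leader actions, columns index follower actions.
    Returns (leader_action, follower_action, leader_value).
    """
    leader_payoff = np.asarray(leader_payoff, dtype=float)
    follower_payoff = np.asarray(follower_payoff, dtype=float)
    best = None
    for a in range(leader_payoff.shape[0]):
        # Follower observes the leader's action and plays a best response
        # (np.argmax breaks ties by taking the lowest index).
        b = int(np.argmax(follower_payoff[a]))
        value = float(leader_payoff[a, b])
        # Leader keeps the action with the highest anticipated value.
        if best is None or value > best[2]:
            best = (a, b, value)
    return best


# Hypothetical toy payoffs: rows are AV (leader) actions,
# columns are BV (follower) actions.
av_payoff = np.array([[3.0, 1.0],
                      [2.0, 2.5]])
# Assumption: a purely adversarial BV, modeled here as zero-sum.
bv_payoff = -av_payoff

result = stackelberg_pure(av_payoff, bv_payoff)
print(result)
```

In this toy game the AV does not pick the row containing its largest payoff (3.0), because the BV's best response to that row would leave it with only 1.0. Anticipating the follower, it picks the second row, whose worst case is 2.0.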