Evolving Testing Scenario Generation Method and Intelligence Evaluation Framework for Automated Vehicles

The interaction between background vehicles (BVs) and automated vehicles (AVs) in scenario-based testing plays a critical role in evaluating the intelligence of AVs. Current testing scenarios typically employ predefined or scripted BVs, which inadequately reflect the complexity of human-like social behaviors in real-world driving, and they lack a systematic metric for evaluating the comprehensive intelligence of AVs. Therefore, this paper proposes an evolving scenario generation method that uses deep reinforcement learning (DRL) to create human-like BVs for the testing and intelligence evaluation of AVs. First, a class of driver models with human-like competitive, cooperative, and mutual driving motivations is designed. Then, using an improved "level-k" training procedure, the three distinct driver models acquire game-based interactive driving policies. These models are assigned to BVs to generate evolving scenarios in which all BVs interact continuously and produce diverse content. Next, a framework comprising safety, driving efficiency, and interaction utility is presented to evaluate and quantify the intelligence of three systems under test (SUTs), demonstrating the effectiveness of the evolving scenario for intelligence testing. Finally, the complexity and fidelity of the proposed evolving testing scenario are validated. The results show that the proposed evolving scenario exhibits the highest complexity among the compared baseline scenarios and has more than 85% similarity to naturalistic driving data. This highlights the potential of the proposed method to facilitate the development and evaluation of high-level AVs in a realistic and challenging environment.
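
For readers unfamiliar with level-k reasoning, the following is a minimal sketch of the standard level-k recursion on a toy two-vehicle merging game; it is not the improved training procedure proposed in the paper. The payoff values, the action names, and the function `level_k_actions` are illustrative assumptions, and a best response over a payoff matrix stands in for the DRL policy training described above.

```python
import numpy as np

# Hypothetical actions for a toy merging conflict (assumption, not from the paper).
YIELD, GO = 0, 1

# PAYOFF[a, b]: reward to a driver playing action a against an opponent playing b.
# The numbers are made up for illustration: going while the other yields is best,
# both going means a collision, both yielding wastes time.
PAYOFF = np.array([
    [0.0, 1.0],    # YIELD vs (YIELD, GO)
    [3.0, -10.0],  # GO    vs (YIELD, GO)
])


def level_k_actions(payoff, level0_action, max_level):
    """Return the action chosen at each reasoning level 0..max_level.

    Level 0 is non-strategic and always plays `level0_action`.
    Level k best-responds to an opponent assumed to reason at level k-1.
    The game is assumed symmetric, so one payoff matrix serves both drivers.
    """
    actions = [level0_action]
    for _ in range(1, max_level + 1):
        opponent_action = actions[-1]  # opponent is modeled one level below
        best_response = int(np.argmax(payoff[:, opponent_action]))
        actions.append(best_response)
    return actions


actions = level_k_actions(PAYOFF, level0_action=GO, max_level=3)
print(actions)
```

In this toy game the levels alternate between going and yielding, because each level best-responds to the level directly below it. In a DRL setting, the `argmax` step would presumably be replaced by training a policy against opponents fixed at the lower level.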
