ROMAN: Reward-Orchestrated Multi-Head Attention Network for Autonomous Driving System Testing

The Automated Driving System (ADS) acts as the brain of an autonomous vehicle and is responsible for its safety and efficiency. Safe deployment requires thorough testing across diverse real-world scenarios and compliance with traffic laws such as speed limits, signal obedience, and right-of-way rules. Violations such as running red lights or speeding pose severe safety risks. However, current testing approaches face two significant challenges: they have limited ability to generate complex, high-risk law-breaking scenarios, and they fail to account for complex interactions involving multiple vehicles and critical situations. To address these challenges, we propose ROMAN, a novel scenario generation approach for ADS testing that combines a multi-head attention network with a traffic law weighting mechanism. ROMAN is designed to generate high-risk violation scenarios, enabling more thorough and targeted ADS evaluation. The multi-head attention mechanism models interactions among vehicles, traffic signals, and other factors. The traffic law weighting mechanism implements a workflow that uses an LLM-based risk weighting module to evaluate violations along two dimensions: severity and occurrence. We evaluated ROMAN by testing the Baidu Apollo ADS within the CARLA simulation platform and conducting extensive experiments to measure its performance. Experimental results show that ROMAN outperformed the state-of-the-art tools ABLE and LawBreaker, achieving a 7.91% higher average violation count than ABLE and a 55.96% higher count than LawBreaker while also maintaining greater scenario diversity. In addition, only ROMAN generated violation scenarios for every clause of the input traffic laws, enabling it to identify more high-risk violations than existing approaches.
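
To make the two-dimensional risk weighting concrete, the following is a minimal sketch of how per-clause severity and occurrence scores could be turned into weights. The abstract does not say how the two dimensions are combined, so the product rule below (a common risk-matrix convention), the [0, 1] score scale, the uniform fallback, and the names `ClauseRisk` and `risk_weights` are all assumptions made for illustration, not ROMAN's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClauseRisk:
    """Hypothetical per-clause scores, e.g. as produced by an LLM-based rater."""
    clause: str
    severity: float    # assumed scale: 0.0 (harmless) to 1.0 (catastrophic)
    occurrence: float  # assumed scale: 0.0 (rare) to 1.0 (frequent)


def risk_weights(risks: list[ClauseRisk]) -> dict[str, float]:
    """Combine severity and occurrence into normalized per-clause weights.

    Assumption: raw risk = severity * occurrence, then normalized so that
    the weights sum to 1. The paper may combine the dimensions differently.
    """
    for r in risks:
        if not (0.0 <= r.severity <= 1.0 and 0.0 <= r.occurrence <= 1.0):
            raise ValueError(f"scores for {r.clause!r} must lie in [0, 1]")

    raw = {r.clause: r.severity * r.occurrence for r in risks}
    total = sum(raw.values())
    if total == 0.0:
        # No clause carries any risk: fall back to uniform weights.
        return {clause: 1.0 / len(raw) for clause in raw}
    return {clause: value / total for clause, value in raw.items()}


if __name__ == "__main__":
    # Illustrative scores only; these numbers are not taken from the paper.
    example = [
        ClauseRisk("red_light", severity=0.5, occurrence=0.5),
        ClauseRisk("speeding", severity=0.25, occurrence=1.0),
    ]
    print(risk_weights(example))
```

Under these assumptions, a scenario generator could scale the reward for violating a clause by that clause's weight, so that severe and frequent violations are prioritized during the search.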
