Ensuring the safety and robustness of autonomous driving systems (ADSs) is imperative. A key means of providing this assurance is the careful construction and execution of test scenarios, a task widely regarded as tedious and laborious. To address this challenge, this paper introduces TARGET, an end-to-end framework that automatically generates test scenarios from established traffic rules. Specifically, we design a domain-specific language (DSL) with a concise and expressive syntax for describing scenarios. To handle the complexity and ambiguity of natural language in traffic rule descriptions, we use a large language model to automatically extract knowledge from the rules and convert their descriptions into DSL representations. From these representations, TARGET synthesizes executable test scenario scripts that render the scenarios in a simulator. We evaluated the framework on four distinct ADSs, generating a total of 217 test scenarios across eight diverse maps. These scenarios revealed approximately 700 rule violations, collisions, and other significant issues, including navigation failures. Moreover, for each detected anomaly, TARGET provides detailed scenario recordings and log reports, which significantly ease troubleshooting and root cause analysis. Two of the identified root causes have been confirmed by the ADS developers: one is corroborated by an existing bug report for the ADS, and the other is attributed to the limited functionality of the ADS.