The public, regulators, and domain experts alike seek to understand the effect of deployed SAE Level 4 automated driving system (ADS) technologies on safety. The recent expansion of ADS deployments is paving the way for early-stage safety impact evaluations, in which observational data from an ADS fleet and a representative benchmark fleet are compared to quantify safety performance. In January 2024, a working group of experts from academia, insurance, and industry convened in Washington, DC to discuss the current and future challenges in performing such evaluations. A subset of this working group then met virtually on multiple occasions to produce this paper. This paper presents the RAVE (Retrospective Automated Vehicle Evaluation) checklist, a set of fifteen recommendations for performing and evaluating retrospective ADS performance comparisons. The recommendations are centered on three concepts: (1) quality and validity, (2) transparency, and (3) interpretation. Over time, a large and varied body of work evaluating the observed performance of these ADS fleets is anticipated. Establishing and promoting good scientific practices benefits the work of stakeholders, many of whom may not be subject matter experts. This working group's intentions are to (i) strengthen individual research studies and (ii) better inform the community at large about how to evaluate this collective body of work.