Simulation-based testing is an important step in ensuring the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, either for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we enhance simulation-based testing by introducing the notion of digital siblings, a multi-simulator approach that tests a given autonomous vehicle on multiple general-purpose simulators built with different technologies, which operate collectively as an ensemble in the testing process. We exemplify our approach on a case study focused on testing the lane-keeping component of an autonomous vehicle. We use two open-source simulators as digital siblings, and we empirically compare this multi-simulator approach against a digital twin of a physical scaled autonomous vehicle on a large set of test cases. Our approach first generates and runs test cases, in the form of sequences of road points, on each individual simulator. Then, test cases are migrated between simulators, using feature maps to characterize the exercised driving conditions. Finally, the joint predicted failure probability is computed, and a failure is reported only when the siblings agree. Our empirical evaluation shows that the ensemble failure predictor built from the digital siblings is better than each individual simulator at predicting the failures of the digital twin. We discuss the findings of our case study and detail how our approach can help researchers interested in automated testing of autonomous driving software.
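The final step of the approach, combining the siblings' predictions and reporting a failure only on agreement, can be illustrated with a minimal sketch. The abstract does not specify how the joint probability is computed, so everything below is an assumption made for illustration: each simulator's feature map is modeled as a dictionary from a feature-map cell to an estimated failure probability, the joint probability is taken as the product of the two estimates, and "agreement" means both estimates reach a hypothetical threshold of 0.5. The function names are likewise hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Assumption: a feature-map cell is a pair of discretized feature indices
# describing the exercised driving conditions.
Cell = Tuple[int, int]
FeatureMap = Dict[Cell, float]  # cell -> estimated failure probability


def joint_failure_probability(p_a: float, p_b: float) -> float:
    """Assumed combination rule: product of the two siblings' estimates."""
    return p_a * p_b


def siblings_agree_on_failure(p_a: float, p_b: float,
                              threshold: float = 0.5) -> bool:
    """Report a failure only if both siblings predict one (assumed threshold)."""
    return p_a >= threshold and p_b >= threshold


def combine_feature_maps(map_a: FeatureMap, map_b: FeatureMap,
                         threshold: float = 0.5
                         ) -> Dict[Cell, Tuple[float, bool]]:
    """Combine two siblings' maps over the cells that both have covered."""
    shared_cells = sorted(map_a.keys() & map_b.keys())
    return {
        cell: (
            joint_failure_probability(map_a[cell], map_b[cell]),
            siblings_agree_on_failure(map_a[cell], map_b[cell], threshold),
        )
        for cell in shared_cells
    }


if __name__ == "__main__":
    # Made-up probabilities, used only to exercise the functions.
    sibling_a = {(0, 0): 0.9, (0, 1): 0.8, (1, 1): 0.1}
    sibling_b = {(0, 0): 0.7, (0, 1): 0.2}
    for cell, (p_joint, failure) in combine_feature_maps(sibling_a, sibling_b).items():
        print(cell, round(p_joint, 2), failure)
```

With these made-up inputs, only cell (0, 0) is reported as a failure, because it is the only cell where both siblings exceed the threshold. Cell (1, 1) is skipped because it was covered by one sibling only, which is the situation that migrating test cases between simulators is meant to reduce.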