Testing and evaluating the safety performance of autonomous vehicles (AVs) is essential before large-scale deployment. In practice, the number of tests that can be afforded for a specific AV model may be extremely small because of cost or time constraints. With existing testing methods, such a strictly limited number of tests often leads to significant uncertainty in the testing results or makes them difficult to quantify. In this paper, we formulate this problem for the first time as the "few-shot testing" (FST) problem and propose a systematic FST framework to address it. To alleviate the considerable uncertainty inherent in a small testing scenario set and to optimize scenario utilization, we frame the FST problem as an optimization problem and search for a small scenario set based on neighborhood coverage and similarity. By leveraging prior information from surrogate models (SMs), we dynamically adjust the testing scenario set and the contribution of each scenario to the testing result, guided by better generalization to AVs. Under certain hypotheses on the SMs, we establish a theoretical upper bound on the testing error to verify that testing accuracy is sufficient within a given limited number of tests. Experiments on the cut-in scenario show that the FST method notably reduces testing error and variance compared with conventional testing methods, especially when the number of scenarios is strictly limited.
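To make the idea of searching for a small, weighted scenario set more concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a finite set of candidate scenarios with known exposure probabilities, assumes each surrogate model has already been evaluated on every candidate, and swaps in a plain greedy search for the optimization. Each selected scenario is weighted by the exposure probability of its neighborhood (the candidates closest to it). All function and variable names are hypothetical.

```python
import numpy as np


def neighborhood_weights(points, probs, chosen):
    """Weight each chosen scenario by the exposure probability of the
    candidates that lie closest to it (its neighborhood)."""
    dists = np.linalg.norm(points[:, None, :] - points[chosen][None, :, :], axis=2)
    owner = dists.argmin(axis=1)  # index (into chosen) of the nearest selected scenario
    return np.bincount(owner, weights=probs, minlength=len(chosen))


def surrogate_error(points, probs, sm_outcomes, chosen):
    """Mean squared gap, over surrogate models, between the few-shot
    estimate and the full expectation over all candidates."""
    weights = neighborhood_weights(points, probs, chosen)
    full = sm_outcomes @ probs                 # one value per surrogate model
    few_shot = sm_outcomes[:, chosen] @ weights
    return float(np.mean((few_shot - full) ** 2))


def greedy_scenario_set(points, probs, sm_outcomes, budget):
    """Greedily add the scenario that most reduces the surrogate error
    (an assumed stand-in for the optimization described in the text)."""
    chosen = []
    for _ in range(budget):
        candidates = [i for i in range(len(points)) if i not in chosen]
        best = min(
            candidates,
            key=lambda i: surrogate_error(points, probs, sm_outcomes, chosen + [i]),
        )
        chosen.append(best)
    return chosen, neighborhood_weights(points, probs, chosen)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 2-D scenario space, e.g. (range, range rate) for a cut-in.
    grid = np.array([[r, v] for r in np.linspace(0, 1, 6) for v in np.linspace(0, 1, 6)])
    probs = rng.random(len(grid))
    probs /= probs.sum()
    # Hypothetical surrogate outcomes: 4 models, one risk value per scenario.
    sm_outcomes = rng.random((4, len(grid)))
    chosen, weights = greedy_scenario_set(grid, probs, sm_outcomes, budget=3)
    print("selected scenarios:", chosen, "weights:", weights)
```

Under these assumptions, the AV under test would be run only on the selected scenarios, and its result would be estimated as the weighted sum of the observed outcomes. How well that estimate transfers depends on how similar the AV is to the surrogate models, which is the role the hypotheses on the SMs play in the error bound.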