Autonomous vehicles (AVs) are often tested in simulation to estimate the probability that they will violate safety specifications. Two common issues arise when using existing techniques to produce this estimate: if violations occur rarely, simple Monte Carlo sampling can fail to produce efficient estimates; if simulation horizons are too long, importance sampling techniques (which learn proposal distributions from past simulations) can fail to converge. This paper addresses both issues by interleaving rare-event sampling techniques with online specification monitoring algorithms. We use adaptive multi-level splitting to decompose simulations into partial trajectories, then calculate the distance of those partial trajectories to failure using robustness metrics from Signal Temporal Logic (STL). By caching those partial robustness values, we can efficiently reuse computations across multiple sampling stages. Our experiments on an interstate lane-change scenario show that our method is viable for testing simulated AV pipelines, efficiently estimating failure probabilities for STL specifications based on real traffic rules. We produce better estimates than Monte Carlo and importance sampling in fewer simulations.
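To make the combination of adaptive multi-level splitting and partial robustness values concrete, the following is a minimal sketch on a toy problem, not the AV pipeline or the implementation evaluated in the paper. It assumes a one-dimensional Gaussian random walk in place of the simulator and the single specification "always x < c", whose STL robustness over a trajectory prefix is the running minimum of c - x. All names (`simulate`, `ams_failure_probability`) and parameter values are hypothetical. Each particle stores its running robustness per step, so a clone can reuse the cached prefix values of its parent and only the resimulated suffix is monitored again.

```python
import random


def simulate(x0, rho0, steps, c, rng):
    """Extend a trajectory by `steps` steps from state x0.

    Returns the new states and the running robustness of "always x < c",
    continued from the cached prefix robustness rho0.
    """
    xs, rhos = [], []
    x, rho = x0, rho0
    for _ in range(steps):
        x += rng.gauss(0.0, 1.0)   # toy dynamics: Gaussian random walk
        rho = min(rho, c - x)      # online (prefix) robustness update
        xs.append(x)
        rhos.append(rho)
    return xs, rhos


def ams_failure_probability(n=200, k=20, horizon=50, c=25.0, seed=0):
    """Estimate P(robustness <= 0) with adaptive multi-level splitting."""
    rng = random.Random(seed)
    particles = []
    for _ in range(n):
        xs, rhos = simulate(0.0, c, horizon, c, rng)
        particles.append(([0.0] + xs, [c] + rhos))

    estimate = 1.0
    while True:
        finals = sorted((p[1][-1] for p in particles), reverse=True)
        level = finals[k - 1]      # k-th largest final robustness
        if level <= 0.0:
            break                  # the level has reached the failure set
        # Keep only particles strictly closer to failure than the level.
        survivors = [p for p in particles if p[1][-1] < level]
        if not survivors:
            return 0.0             # extinction: every particle was discarded
        estimate *= len(survivors) / n
        clones = []
        for _ in range(n - len(survivors)):
            xs, rhos = rng.choice(survivors)
            # Branch at the first step where the cached robustness drops
            # below the level; the prefix and its robustness are reused.
            t = next(i for i, r in enumerate(rhos) if r < level)
            new_xs, new_rhos = simulate(xs[t], rhos[t], horizon - t, c, rng)
            clones.append((xs[:t + 1] + new_xs, rhos[:t + 1] + new_rhos))
        particles = survivors + clones

    failed = sum(1 for p in particles if p[1][-1] <= 0.0)
    return estimate * failed / n


if __name__ == "__main__":
    print(ams_failure_probability())
```

In this sketch, discarding every particle whose robustness is at or above the level (rather than exactly k of them) keeps the estimator well defined when several particles tie. A real STL monitor would replace the running minimum with the online robustness computation appropriate to the full specification.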