This paper studies the class of scenario-based safety testing algorithms in the black-box safety testing configuration. For algorithms that cover the same set of state-actions but sample from it with different distributions, it is commonly believed that prioritizing the exploration of high-risk state-actions leads to better sampling efficiency. We dispute this intuition with an impossibility theorem, which proves that all safety testing algorithms differing only in their sampling distributions have the same expected sampling efficiency. Moreover, for testing algorithms that cover different sets of state-actions, the sampling efficiency criterion is no longer applicable, because different algorithms do not necessarily converge to the same termination condition. We therefore propose a definition of testing aggressiveness based on the almost safe set concept, along with an unbiased and efficient algorithm for comparing the aggressiveness of testing algorithms. Empirical observations from the safety testing of bipedal locomotion controllers and vehicle decision-making modules are also presented to support the theoretical implications and the proposed methodology.