This paper presents an extended version of Deeper, a search-based, simulation-integrated test solution that generates failure-revealing test scenarios for testing a deep neural network-based lane-keeping system. The new version adds a set of bio-inspired search algorithms: genetic algorithm (GA), $({\mu}+{\lambda})$ and $({\mu},{\lambda})$ evolution strategies (ES), and particle swarm optimization (PSO). These algorithms leverage a quality population seed and domain-specific crossover and mutation operations tailored for the representation model used for modeling the test scenarios. To demonstrate the capabilities of the new test generators within Deeper, we carry out an empirical evaluation and compare them against the results of five participating tools in the cyber-physical systems testing competition at SBST 2021. Our evaluation shows that the newly proposed test generators in Deeper not only represent a considerable improvement on the previous version but are also effective and efficient in provoking a considerable number of diverse failure-revealing test scenarios for testing an ML-driven lane-keeping system. They can trigger several failures while promoting test scenario diversity under a limited test time budget, high target failure severity, and strict speed limit constraints.
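For readers unfamiliar with the notation, the two evolution strategies named above differ only in which individuals may survive into the next generation. The following is a minimal sketch of that textbook distinction, not Deeper's implementation: the function name `es_survivors`, the `strategy` argument, and the use of plain numbers as individuals are all assumptions made for illustration.

```python
def es_survivors(parents, offspring, mu, fitness, strategy="plus"):
    """Pick the mu fittest individuals for the next generation.

    strategy="plus":  (mu + lambda), parents compete with their offspring.
    strategy="comma": (mu, lambda), only offspring are eligible.
    Hypothetical helper; higher fitness is assumed to be better.
    """
    if strategy == "plus":
        pool = list(parents) + list(offspring)
    elif strategy == "comma":
        if len(offspring) < mu:
            raise ValueError("(mu, lambda) selection needs lambda >= mu")
        pool = list(offspring)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Sort best-first and keep the top mu individuals.
    return sorted(pool, key=fitness, reverse=True)[:mu]


# Toy individuals: each number stands in for a test scenario and is
# also its own fitness value (an assumption for this sketch only).
parents = [5.0, 4.0]
offspring = [3.0, 1.0, 2.0]
plus_next = es_survivors(parents, offspring, mu=2, fitness=lambda x: x, strategy="plus")
comma_next = es_survivors(parents, offspring, mu=2, fitness=lambda x: x, strategy="comma")
```

In this toy run the plus strategy keeps both parents because they outrank every offspring, whereas the comma strategy discards them and keeps the two best offspring. Plus selection is therefore elitist, while comma selection forgets old solutions each generation.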