In safety-critical systems (e.g., autonomous vehicles and robots), Deep Neural Networks (DNNs) are becoming a key component for computer vision tasks, particularly semantic segmentation. Because DNN behavior cannot be assessed through code inspection and analysis, test automation has become essential to gaining confidence in the reliability of DNNs. Unfortunately, state-of-the-art automated testing solutions largely rely on simulators, whose fidelity is always imperfect, which affects the validity of test results. To address this limitation, we propose combining meta-heuristic search, used to explore the input space with simulators, with Generative Adversarial Networks (GANs), used to transform the data generated by simulators into realistic input images. Such images can be used both to assess DNN performance and to retrain the DNN more effectively. We applied our approach to a state-of-the-art DNN performing semantic segmentation and demonstrated that it outperforms a state-of-the-art GAN-based testing solution and several baselines. Specifically, it yields the largest number of diverse images that lead to the worst DNN performance. Further, the images generated with our approach lead to the highest improvement in DNN performance when used for retraining. In conclusion, we suggest always integrating GAN components when performing search-driven, simulator-based testing.
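To make the combination of search, simulation, and GAN-based translation more concrete, the following is a minimal sketch of how such a loop could be wired together. It is not the implementation evaluated in this work: `simulate`, `gan_translate`, and `dnn_accuracy` are hypothetical toy stand-ins for a real simulator, a trained GAN generator, and the DNN under test, and the simple elitist evolutionary search only minimizes accuracy, ignoring the diversity objective mentioned above.

```python
import random
from typing import List, Tuple

Params = List[float]  # simulator configuration, each value in [0, 1]


def simulate(params: Params) -> List[float]:
    """Hypothetical simulator: returns a toy 'label map' derived from params."""
    return list(params)


def gan_translate(label_map: List[float]) -> List[float]:
    """Placeholder for a trained GAN generator (simulator data -> realistic image)."""
    return [0.5 * v + 0.25 for v in label_map]


def dnn_accuracy(image: List[float]) -> float:
    """Placeholder for the DNN under test; returns a toy score in [0, 1]."""
    # Toy behaviour: accuracy drops as pixel values approach 0.75.
    return sum(abs(v - 0.75) for v in image) / (0.5 * len(image))


def fitness(params: Params) -> float:
    """Accuracy of the DNN on the GAN-translated simulator output (lower = worse)."""
    return dnn_accuracy(gan_translate(simulate(params)))


def mutate(params: Params, rng: random.Random, sigma: float = 0.1) -> Params:
    """Gaussian mutation, clipped to the valid parameter range [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in params]


def search_worst_cases(
    generations: int = 30, pop_size: int = 10, dims: int = 4, seed: int = 0
) -> List[Tuple[Params, float]]:
    """Elitist evolutionary search for inputs that minimize DNN accuracy."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(p, rng) for p in population]
        # Keep the pop_size candidates with the lowest accuracy among
        # parents and offspring, so the worst case found is never lost.
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return [(p, fitness(p)) for p in population]


if __name__ == "__main__":
    for params, score in search_worst_cases()[:3]:
        print([round(p, 2) for p in params], round(score, 3))
```

In a real setup, the images produced for the lowest-scoring parameters would be the candidates for both failure analysis and retraining. Placing the GAN inside the fitness evaluation means the search is guided by DNN behavior on the translated images rather than on raw simulator output.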