Mutation testing can help reduce the risk of releasing faulty software. For this reason, it is a desirable practice in the development of embedded software running in safety-critical cyber-physical systems (CPS). Unfortunately, state-of-the-art test data generation techniques for mutation testing of C and C++ software, two typical languages for CPS software, rely on symbolic execution, whose limitations often prevent its application (e.g., it cannot test black-box components). We propose a mutation testing approach that leverages fuzz testing, which has proved effective with C and C++ software. Fuzz testing automatically generates diverse test inputs that exercise program branches in many different ways and, therefore, exercise statements in different program states, thus maximizing the likelihood of killing mutants, which is our objective. We performed an empirical assessment of our approach with software components used in satellite systems currently in orbit. Our empirical evaluation shows that mutation testing based on fuzz testing kills a significantly higher proportion of live mutants than symbolic execution (up to 47 percentage points more). Further, when symbolic execution cannot be applied, fuzz testing still provides significant benefits (up to 41% of mutants killed). Our study is the first to compare fuzz testing and symbolic execution for mutation testing; our results provide guidance towards the development of fuzz testing tools dedicated to mutation testing.
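To make the idea of killing a mutant with generated inputs concrete, the following is a minimal sketch in C and not the tooling evaluated in this work. It assumes a hypothetical function `in_range_original`, a hypothetical mutant `in_range_mutant` obtained by replacing `>=` with `>`, and a small deterministic pseudo-random generator standing in for a real fuzzer's input generation. Every generated input covers the mutated statement, but only inputs that reach it in a specific program state (here, `v == lo`) make the two versions disagree, which is why diversity of inputs matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function under test: is a reading inside [lo, hi]? */
static int in_range_original(int32_t v, int32_t lo, int32_t hi) {
    return v >= lo && v <= hi;
}

/* Hypothetical mutant: ">=" replaced by ">" (relational operator replacement). */
static int in_range_mutant(int32_t v, int32_t lo, int32_t hi) {
    return v > lo && v <= hi;
}

/* Deterministic xorshift32 generator; an assumption standing in for a
 * real fuzzer's input generation. The state must be non-zero. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

typedef struct {
    int32_t v;
    int32_t lo;
    int32_t hi;
} test_input;

/* Generates up to max_iters inputs and compares original and mutant on each.
 * Returns 1 and stores the input in *out when the outputs differ (the mutant
 * is killed); returns 0 if no generated input distinguishes the two. */
static int find_killing_input(uint32_t seed, unsigned max_iters, test_input *out) {
    assert(out != NULL);
    uint32_t state = seed ? seed : 1u; /* avoid the all-zero state */
    for (unsigned i = 0; i < max_iters; i++) {
        test_input in;
        /* Small value domain so that boundary states are reached quickly. */
        in.v = (int32_t)(next_random(&state) % 16u);
        in.lo = (int32_t)(next_random(&state) % 16u);
        in.hi = (int32_t)(next_random(&state) % 16u);
        if (in_range_original(in.v, in.lo, in.hi) !=
            in_range_mutant(in.v, in.lo, in.hi)) {
            *out = in;
            return 1;
        }
    }
    return 0;
}
```

Calling `find_killing_input(1u, 100000u, &input)` searches for an input on which the two versions disagree; any input it reports necessarily has `v == lo` and `v <= hi`. A real fuzzer would replace the random generator with coverage-guided mutation of input data, and comparing outputs is only one possible way to decide that a mutant has been killed.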