Search-based software testing (SBST) typically relies on fitness functions to guide the search toward software failures. There are two main techniques for defining fitness functions: (a) automated computation of the fitness function from the specification of the system requirements, and (b) manual fitness function design. Both techniques have advantages. The former uses information from the system requirements to guide the search toward portions of the input domain that are more likely to contain failures. The latter uses the engineers' domain knowledge. We propose ATheNA, a novel SBST framework that combines fitness functions automatically generated from requirements specifications with those manually defined by engineers. We design and implement ATheNA-S, an instance of ATheNA that targets Simulink models. We evaluate ATheNA-S on a large set of models from different domains. Our results show that ATheNA-S generates more failure-revealing test cases than existing baseline tools and that the difference between the runtime performance of ATheNA-S and that of the baseline tools is not statistically significant. We also assess whether ATheNA-S can generate failure-revealing test cases when applied to two representative case studies: one from the automotive domain and one from the medical domain. Our results show that ATheNA-S successfully revealed a requirement violation in our case studies.
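To make the idea of combining the two kinds of fitness functions concrete, the following is a minimal sketch and not the ATheNA-S implementation. The abstract does not state how the two fitness values are combined, so the weighted sum used here is an assumption. The toy system (`simulate`), the requirement ("the output always stays below a threshold"), the manual heuristic, and all function names are likewise hypothetical and chosen only for illustration.

```python
import random
from typing import List, Sequence, Tuple


def simulate(inputs: Sequence[float]) -> List[float]:
    """Hypothetical system under test: a first-order lag driven by the inputs."""
    y, trace = 0.0, []
    for u in inputs:
        y = y + 0.5 * (u - y)
        trace.append(y)
    return trace


def automatic_fitness(trace: Sequence[float], threshold: float = 1.0) -> float:
    """Fitness derived from the assumed requirement 'output always < threshold'.

    It is the smallest margin to the threshold; a negative value means
    the requirement is violated somewhere in the trace.
    """
    return min(threshold - y for y in trace)


def manual_fitness(trace: Sequence[float]) -> float:
    """Assumed engineer-written heuristic: large jumps between consecutive
    samples are suspected to lead to failures, so they get a lower value."""
    return -max(abs(b - a) for a, b in zip(trace, trace[1:]))


def combined_fitness(trace: Sequence[float], weight: float = 0.5) -> float:
    """Weighted sum of both fitness values (an assumption of this sketch)."""
    return weight * automatic_fitness(trace) + (1.0 - weight) * manual_fitness(trace)


def random_search(budget: int, seed: int = 0) -> Tuple[List[float], float]:
    """Plain random search that minimizes the combined fitness.

    It stops early once the automatic fitness is negative, that is, once a
    failure-revealing test case has been found.
    """
    rng = random.Random(seed)
    best_inputs: List[float] = []
    best_value = float("inf")
    for _ in range(budget):
        candidate = [rng.uniform(0.0, 2.0) for _ in range(5)]
        trace = simulate(candidate)
        value = combined_fitness(trace)
        if value < best_value:
            best_inputs, best_value = candidate, value
        if automatic_fitness(trace) < 0:
            best_inputs, best_value = candidate, value
            break
    return best_inputs, best_value


if __name__ == "__main__":
    inputs, value = random_search(budget=100)
    violated = automatic_fitness(simulate(inputs)) < 0
    print("best inputs:", inputs)
    print("combined fitness:", value, "requirement violated:", violated)
```

In this sketch the automatic part pulls the search toward inputs that shrink the margin to the requirement, while the manual part encodes a domain hunch; the `weight` parameter trades one against the other and would need tuning for a real model.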