Specification inference techniques aim to automatically infer a set of assertions that capture the exhibited software behaviour by generating and filtering assertions through dynamic test executions and mutation testing. Although powerful, such techniques are computationally expensive because of the large number of assertions, test cases, and mutated versions that need to be executed. To overcome this issue, we demonstrate that a small subset (12.95%) of the mutants used by mutation testing tools is sufficient for assertion inference, that this subset differs significantly (by 71.59%) from the subsuming mutant set frequently cited in the mutation testing literature, and that it can be statically approximated through a learning-based method. In particular, we propose AIMS, an approach that selects Assertion Inferring Mutants, i.e., a set of mutants that are well suited for assertion inference, with 0.58 MCC, 0.79 Precision, and 0.49 Recall. We evaluate AIMS on 46 programs and demonstrate that its inference capabilities are comparable to those of full mutation analysis (it misses 12.49% of assertions) while significantly limiting execution cost (it runs 46.29 times faster). A comparison with randomly selected sets of mutants shows the superiority of AIMS, which infers 36% more assertions while requiring approximately the same execution time. We also show that AIMS's inference capabilities are almost complete: it infers 96.15% of ground-truth assertions (i.e., a complete set of assertions that were manually constructed), while Random Mutant Selection infers 19.23% of them. More importantly, AIMS enables assertion inference techniques to scale to subjects where full mutation testing is prohibitively expensive and Random Mutant Selection does not lead to any assertion.
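For readers unfamiliar with the selection metrics reported above, the following is a minimal sketch of how Precision, Recall, and MCC (Matthews correlation coefficient) can be computed for a binary mutant selector, treating "assertion inferring" as the positive class. The function name `selection_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the evaluation and do not reproduce the reported values.

```python
import math


def selection_metrics(tp, fp, fn, tn):
    """Return (precision, recall, mcc) for a binary mutant selector.

    tp: selected mutants that are assertion inferring
    fp: selected mutants that are not assertion inferring
    fn: assertion-inferring mutants that were not selected
    tn: remaining mutants, correctly left unselected
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Hypothetical counts for illustration only (not data from the paper).
precision, recall, mcc = selection_metrics(tp=50, fp=10, fn=50, tn=890)
print(f"precision={precision:.2f} recall={recall:.2f} mcc={mcc:.2f}")
```

MCC takes all four cells of the confusion matrix into account, which makes it a common choice when the positive class is a small fraction of all mutants, as is the case for the 12.95% subset described above.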