As a new research area, quantum software testing lacks systematic benchmarks for assessing the effectiveness of testing techniques. Recently, some open-source benchmarks and mutation analysis tools have emerged. However, there is insufficient evidence on which quantum circuit characteristics (e.g., circuit depth, number of quantum gates), algorithms (e.g., Quantum Approximate Optimization Algorithm), and mutation characteristics (e.g., mutation operators) most affect mutant detection in quantum circuits. Studying these relations is important for systematically designing faulty benchmarks with varied attributes (e.g., the difficulty of detecting a seeded fault), which in turn makes it easier to assess the cost-effectiveness of quantum software testing techniques. To this end, we present a large-scale empirical evaluation with more than 700K faulty benchmarks (quantum circuits) generated by mutating 382 real-world quantum circuits. Based on the results, we provide insights that researchers can use to define systematic quantum mutation analysis techniques. We also provide a tool that recommends mutants to users based on chosen characteristics (e.g., a quantum algorithm type) and the required difficulty of killing mutants. Finally, we provide faulty benchmarks that can already be used to assess the cost-effectiveness of quantum software testing techniques.