The Cancer Registration Support System (CaReSS), built by the Cancer Registry of Norway (CRN), is a complex, real-world socio-technical software system whose implementation evolves continuously. Consequently, CaReSS must be tested continuously with automated tools so that it remains dependable. Towards automated testing of GURI, a key software subsystem of CaReSS, we present a real-world application of an extension to the open-source tool EvoMaster, which automatically generates test cases with evolutionary algorithms. The extension, named EvoClass, enhances EvoMaster with a machine learning classifier to reduce the overall testing cost. This is imperative because testing with EvoMaster involves sending many requests to GURI deployed in different environments, including the production environment, whose performance and functionality could be affected by a high volume of requests. The classifier predicts whether a request generated by EvoMaster will be executed successfully; requests predicted to fail are filtered out, which reduces the number of requests executed on GURI. We evaluated EvoClass on ten GURI versions spanning four years in three environments: development, testing, and production. Results show that, compared to the default EvoMaster, EvoClass significantly reduces the cost of testing the evolving GURI without reducing testing effectiveness (measured as rule coverage) in any of the three environments. Overall, EvoClass achieved a cost reduction of about 31%. Finally, we report experiences and lessons learned that are valuable for both researchers and practitioners.
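The filtering idea described above can be illustrated with a minimal sketch. It is not the EvoClass implementation: the `Request` type, the `filter_requests` and `cost_reduction` helpers, and the toy predictor are all hypothetical names introduced here, and the sketch assumes for illustration that cost is counted as the number of requests executed.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for a request produced by a test generator."""
    endpoint: str
    payload: dict = field(default_factory=dict, compare=False, hash=False)


def filter_requests(
    requests: Iterable[Request],
    predict_success: Callable[[Request], bool],
) -> Tuple[List[Request], List[Request]]:
    """Split requests into those predicted to succeed and those predicted to fail.

    Only the first list would be sent to the system under test.
    """
    kept: List[Request] = []
    skipped: List[Request] = []
    for request in requests:
        if predict_success(request):
            kept.append(request)
        else:
            skipped.append(request)
    return kept, skipped


def cost_reduction(total: int, executed: int) -> float:
    """Fraction of requests saved, assuming cost is the number of requests executed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - executed) / total


def toy_predictor(request: Request) -> bool:
    """Assumed placeholder for a trained classifier: requires a 'date' field."""
    return "date" in request.payload


if __name__ == "__main__":
    generated = [
        Request("/validate", {"date": "2020-01-01"}),
        Request("/validate", {}),
        Request("/aggregate", {"date": "2021-05-05"}),
        Request("/aggregate", {"id": 7}),
    ]
    kept, skipped = filter_requests(generated, toy_predictor)
    print(len(kept), len(skipped), cost_reduction(len(generated), len(kept)))
```

In practice the predictor would be a classifier trained on earlier request outcomes, and a false "will fail" prediction would discard a useful request, which is why testing effectiveness needs to be checked alongside cost.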