The Cancer Registry of Norway (CRN) collects and processes cancer-related data for patients in Norway. For this, it employs a sociotechnical software system that evolves with changing requirements and medical standards. The current practice is to test CRN's system manually to prevent faults and ensure its dependability. This paper focuses on automatically testing GURI, CRN's medical rule engine, using a system-level testing tool, EvoMaster, in both its black-box and white-box modes, and a novel CRN-specific EvoMaster-based tool, EvoGURI. We empirically evaluate the tools' effectiveness in terms of code coverage, errors found, domain-specific rule coverage, and ability to identify artificial faults on ten versions of GURI. Our results show that all the tools achieve similar code coverage and identify a similar number of errors. For rule coverage, EvoGURI and EvoMaster's black-box mode produce test suites that cover the highest number of rules with Pass, Fail, and Warning results. In a mutation testing experiment, the test suites of EvoGURI and two EvoMaster white-box tools identify the most faults. Based on our findings, we recommend using EvoGURI in CRN's current practice. Finally, we present key takeaways and outline open research questions for the research community.