From simulating galaxy formation to viral transmission in a pandemic, scientific models play a pivotal role in developing scientific theories and supporting government policy decisions that affect us all. Given these critical applications, a poor modelling assumption or bug could have far-reaching consequences. However, scientific models possess several properties that make them notoriously difficult to test, including a complex input space, long execution times, and non-determinism, rendering existing testing techniques impractical. In fields such as epidemiology, where researchers seek answers to challenging causal questions, a statistical methodology known as Causal Inference has addressed similar problems, enabling the inference of causal conclusions from noisy, biased, and sparse data instead of costly experiments. This paper introduces the Causal Testing Framework: a framework that uses Causal Inference techniques to establish causal effects from existing data, enabling users to conduct software testing activities concerning the effect of a change, such as Metamorphic Testing, a posteriori. We present three case studies covering real-world scientific models, demonstrating how the Causal Testing Framework can infer metamorphic test outcomes from reused, confounded test data to provide an efficient solution for testing scientific modelling software.
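As a rough illustration of the idea of inferring a metamorphic test outcome from confounded data, the sketch below estimates the effect of changing one input by adjusting for a known confounder with ordinary least squares. It is a minimal sketch on synthetic data, not the Causal Testing Framework's API: the toy data-generating process, the function names (`simulate_confounded_runs`, `ols_coefficients`), the expected effect of 2.0, and the tolerance are all assumptions made for illustration.

```python
import numpy as np


def simulate_confounded_runs(n=5000, seed=0):
    """Toy stand-in for previously collected model runs (hypothetical data)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # confounder: affects both x and y
    x = 1.5 * z + rng.normal(size=n)            # input under test, correlated with z
    y = 2.0 * x + 3.0 * z + rng.normal(scale=0.5, size=n)  # true effect of x is 2.0
    return x, z, y


def ols_coefficients(columns, y):
    """Least-squares fit of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


x, z, y = simulate_confounded_runs()

# Naive estimate: regress y on x only, ignoring the confounder (biased).
naive_effect = ols_coefficients([x], y)[1]

# Adjusted estimate: include the confounder z in the regression.
adjusted_effect = ols_coefficients([x, z], y)[1]

# Assumed metamorphic relation: increasing x by 1 should increase y by about 2.
expected_effect = 2.0
tolerance = 0.1
test_passes = abs(adjusted_effect - expected_effect) < tolerance

print(f"naive estimate:    {naive_effect:.3f}")
print(f"adjusted estimate: {adjusted_effect:.3f}")
print(f"metamorphic test passes: {test_passes}")
```

In this toy setting the naive estimate overstates the effect because the confounder drives both the input and the output, while the adjusted estimate recovers a value close to the assumed true effect without rerunning the model.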