Context: Code coverage is widely used as a software quality assurance measure. However, both its effect and the advisable dose remain disputed in the research and engineering communities. Prior work reports only correlational associations, leaving results vulnerable to confounding factors. Objective: We aim to quantify the causal effect of code coverage (exposure) on bug introduction (outcome) in mature JavaScript and TypeScript open source projects, addressing both the overall effect and how it varies across coverage levels. Method: We construct a causal directed acyclic graph to identify confounders within the software engineering process, modeling key variables from the source code, the issue and review systems, and continuous integration. Using generalized propensity score adjustment, we will apply doubly robust, regression-based causal inference for a continuous exposure to a novel dataset of bug-introducing and non-bug-introducing changes. We will estimate the average treatment effect and the dose-response relationship to examine potential non-linear patterns (e.g., thresholds or diminishing returns) within the projects of our dataset.
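To make the estimation strategy in the Method concrete, the following is a minimal sketch of generalized propensity score (GPS) weighting combined with an outcome regression for a continuous exposure. It is an illustration under stated assumptions, not the study's actual pipeline: the data are synthetic, the single confounder `x`, the linear-Gaussian treatment model, the quadratic outcome model, and all variable names are hypothetical, and the outcome is a continuous risk score rather than a binary bug-introduction label. Combining stabilized GPS weights with a covariate-adjusted outcome regression is one common way to obtain a doubly robust style estimate; the abstract does not say which specific estimator will be used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic data (hypothetical, for illustration only).
x = rng.normal(size=n)                                  # confounder, e.g. standardized change size
t = 0.5 + 0.05 * x + rng.normal(0, 0.1, n)              # exposure: coverage level
y = 1.0 - 1.0 * t + 0.5 * x + rng.normal(0, 0.1, n)     # outcome: bug-risk score


def normal_pdf(z, mean, sd):
    """Density of Normal(mean, sd^2) evaluated at z."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))


# 1) Treatment model: assume T | X ~ Normal(a + b*x, sd^2), fitted by OLS.
A = np.column_stack([np.ones(n), x])
coef_t, *_ = np.linalg.lstsq(A, t, rcond=None)
resid = t - A @ coef_t
gps = normal_pdf(t, A @ coef_t, resid.std(ddof=2))      # generalized propensity score

# 2) Stabilized weights: marginal density of T over conditional density of T given X.
w = normal_pdf(t, t.mean(), t.std(ddof=1)) / gps

# 3) Weighted outcome regression, quadratic in dose and adjusted for x.
#    Scaling rows by sqrt(w) turns ordinary least squares into weighted least squares.
D = np.column_stack([np.ones(n), t, t ** 2, x])
sw = np.sqrt(w)
beta, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)


# 4) Dose-response curve by standardization: set everyone's dose to `dose`,
#    keep their observed covariates, and average the predictions.
def dose_response(dose):
    Dd = np.column_stack([np.ones(n), np.full(n, dose), np.full(n, dose ** 2), x])
    return float((Dd @ beta).mean())


grid = np.linspace(0.3, 0.7, 5)
curve = np.array([dose_response(d) for d in grid])

naive_slope = np.polyfit(t, y, 1)[0]                    # unadjusted association
adjusted_slope = (curve[-1] - curve[0]) / (grid[-1] - grid[0])

print("naive slope:   ", round(naive_slope, 2))
print("adjusted slope:", round(adjusted_slope, 2))
print("dose-response: ", np.round(curve, 2))
```

In this simulation the data-generating process makes higher coverage lower the outcome, while the confounder pushes both coverage and the outcome up, so the unadjusted slope and the adjusted slope can differ even in sign. The quadratic dose term is what lets the fitted curve bend, which is the kind of flexibility needed to look for thresholds or diminishing returns; a real analysis would likely use a more flexible model and check weight stability and covariate balance.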