Process-based Indicators of Vulnerability Re-Introducing Code Changes: An Exploratory Case Study

From arXiv: 9 pages, 6 figures. Samiha Shimmi and Nicholas M. Synovic contributed equally to this work (co-first authors); Mona Rahimi and George K. Thiruvathukal contributed equally to this work (co-supervisors).

Software vulnerabilities often persist or re-emerge even after being fixed, revealing the complex interplay between code evolution and socio-technical factors. While source code metrics provide useful indicators of vulnerabilities, software engineering process metrics can uncover patterns that lead to their introduction. Yet few studies have explored whether process metrics can reveal risky development activities over time -- insights that are essential for anticipating and mitigating software vulnerabilities. This work highlights the critical role of process metrics along with code changes in understanding and mitigating vulnerability reintroduction. We move beyond file-level prediction and instead analyze security fixes at the commit level, focusing not only on whether a single fix introduces a vulnerability but also on the longer sequences of changes through which vulnerabilities evolve and re-emerge. Our approach emphasizes that reintroduction is rarely the result of one isolated action, but emerges from cumulative development activities and socio-technical conditions. To support this analysis, we conducted a case study on the ImageMagick project by correlating longitudinal process metrics such as bus factor, issue density, and issue spoilage with vulnerability reintroduction activities, encompassing 76 instances of reintroduced vulnerabilities. Our findings show that reintroductions often align with increased issue spoilage and fluctuating issue density, reflecting short-term inefficiencies in issue management and team responsiveness. These observations provide a foundation for broader studies that combine process and code metrics to predict risky fixes and strengthen software security.
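To make the process metrics named above more concrete, the following is a minimal sketch of how two of them might be computed from issue-tracker data. The abstract does not define the metrics, so the definitions here are assumptions: issue spoilage is taken as the number of issues open on a given day, and issue density as the number of issues opened so far per thousand lines of code (KLOC). The `Issue` class, the function names, and the toy data are hypothetical and are not taken from the study's tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class Issue:
    """Hypothetical issue record: open date and optional close date."""
    opened: date
    closed: Optional[date] = None  # None means the issue is still open


def issue_spoilage(issues: Iterable[Issue], day: date) -> int:
    """Assumed definition: count of issues that are open on `day`."""
    return sum(
        1
        for issue in issues
        if issue.opened <= day and (issue.closed is None or issue.closed > day)
    )


def issue_density(issues: Iterable[Issue], day: date, kloc: float) -> float:
    """Assumed definition: issues opened up to `day`, divided by KLOC."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    opened_so_far = sum(1 for issue in issues if issue.opened <= day)
    return opened_so_far / kloc


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    issues = [
        Issue(opened=date(2024, 1, 1), closed=date(2024, 1, 10)),
        Issue(opened=date(2024, 1, 5)),
        Issue(opened=date(2024, 2, 1)),
    ]
    for day in (date(2024, 1, 7), date(2024, 1, 15), date(2024, 2, 2)):
        print(day, issue_spoilage(issues, day), issue_density(issues, day, kloc=4.0))
```

Evaluating these functions once per day or per commit gives a time series that could then be lined up against the dates of vulnerability-reintroducing commits. That is the kind of longitudinal comparison the abstract describes, though the study's actual procedure may differ.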
