Security Defect Detection via Code Review: A Study of the OpenStack and Qt Communities

Background: Despite the widespread use of automated security defect detection tools, software projects still contain many security defects that could result in serious damage. Such tools are largely context-insensitive and may not cover all possible scenarios when testing for potential issues, which makes them prone to missing complex security defects. Thorough detection therefore requires combining these tools with human-intensive detection techniques, including code review. Code review is widely recognized as a crucial and effective practice for identifying security defects. Aim: This work aims to empirically investigate security defect detection through code review. Method: To this end, we conducted an empirical study by analyzing code review comments from four projects in the OpenStack and Qt communities. By manually checking 20,995 review comments obtained through keyword-based search, we identified 614 comments as security-related. Results: Our results show that (1) security defects are not prevalently discussed in code review, (2) more than half of the reviewers provided explicit fixing strategies/solutions to help developers fix security defects, (3) developers tend to follow reviewers' suggestions and make the requested changes, and (4) "Not worth fixing the defect now" and "Disagreement between the developer and the reviewer" are the main causes for not resolving security defects. Conclusions: Our research results demonstrate that (1) software security practices should combine manual code review with automated detection tools to achieve more comprehensive coverage in identifying and addressing security defects, and (2) promoting appropriate standardization of practitioners' behaviors during code review remains necessary for enhancing software security.
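The keyword-based search step mentioned in the Method could look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the study's actual tooling: the keyword list and the function names `build_pattern` and `candidate_comments` are hypothetical, and the abstract does not list the keywords that were really used.

```python
import re

# Hypothetical keywords for illustration only; the abstract does not
# state which keywords the study actually searched for.
SECURITY_KEYWORDS = ["overflow", "race condition", "injection", "deadlock"]


def build_pattern(keywords):
    """Compile one case-insensitive regex matching any keyword as a whole word or phrase."""
    escaped = [re.escape(k) for k in keywords]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def candidate_comments(comments, keywords=SECURITY_KEYWORDS):
    """Return the comments that mention at least one keyword.

    These are only candidates: a keyword hit does not mean the comment
    is security-related, so a manual check is still needed afterwards.
    """
    pattern = build_pattern(keywords)
    return [c for c in comments if pattern.search(c)]


if __name__ == "__main__":
    sample = [
        "This could cause a buffer overflow.",
        "Nit: rename this variable.",
        "Possible Race Condition here when two threads write.",
    ]
    for comment in candidate_comments(sample):
        print(comment)
```

A filter like this only narrows the pool. In the study, manual checking reduced the 20,995 keyword matches to 614 security-related comments, so most keyword hits turned out not to be security-related.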
