Detecting Build Dependency Errors in Incremental Builds

Incremental and parallel builds performed by build tools such as Make are at the heart of modern C/C++ software projects. Their correct and efficient execution depends on build scripts, which are prone to errors. The most prevalent errors are missing dependencies (MDs) and redundant dependencies (RDs). State-of-the-art methods for detecting these errors rely on clean builds (i.e., full builds of a subset of software configurations in a clean environment), which are costly and can take multiple hours for large-scale projects. To address this challenge, we propose a novel approach called EChecker that detects build dependency errors in the context of incremental builds. The core idea of EChecker is to automatically update the actual build dependencies by inferring them from the C/C++ pre-processor directives and Makefile changes in new commits, which avoids clean builds when possible. EChecker achieves higher efficiency than methods that rely on clean builds while maintaining effectiveness. We evaluated the effectiveness and efficiency of EChecker on 12 representative projects, ranging in size from small to large, using 240 commits (20 commits per project), and compared the results with a state-of-the-art build dependency error detection tool. The evaluation shows that EChecker improves the F1 score by 0.18 over the state-of-the-art method and increases build dependency error detection efficiency by an average of 85.14 times (with a median of 16.30 times). The results demonstrate that EChecker can support practitioners in detecting build dependency errors efficiently.
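To make the two error classes concrete, the following minimal sketch compares the prerequisites a build script declares for a target with the dependencies inferred from `#include` directives. It is an illustration under simplifying assumptions, not EChecker's implementation: the file names, the `included_headers` and `classify` helpers, and the sample source are all hypothetical, and the scan covers only directly included quoted headers, ignoring transitive includes and conditional compilation.

```python
import re

# Matches quoted includes such as: #include "util.h"
# (angle-bracket system headers are deliberately ignored in this sketch)
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def included_headers(source_text):
    """Return the set of quoted headers named in #include directives."""
    return set(INCLUDE_RE.findall(source_text))


def classify(declared, actual):
    """Compare declared prerequisites with the inferred actual ones.

    missing   (MD): used by the build but not declared in the build script
    redundant (RD): declared in the build script but not actually used
    """
    missing = actual - declared
    redundant = declared - actual
    return missing, redundant


# Hypothetical source file for a target main.o
source = '''
#include <stdio.h>
#include "util.h"
#include "config.h"
int main(void) { return 0; }
'''

# Hypothetical rule in the Makefile: main.o: main.c util.h legacy.h
declared = {"main.c", "util.h", "legacy.h"}
actual = {"main.c"} | included_headers(source)

missing, redundant = classify(declared, actual)
print(sorted(missing))    # ['config.h']
print(sorted(redundant))  # ['legacy.h']
```

Here `config.h` is a missing dependency (editing it would not trigger a rebuild of `main.o`), while `legacy.h` is a redundant one (editing it would trigger an unnecessary rebuild). In an incremental setting, only the files touched by a new commit would need to be rescanned to refresh `actual`.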
