Static call graph (CG) construction often over-approximates call relations, leading to sound but imprecise results. Recent research has explored machine learning (ML)-based CG pruning as a means to enhance precision by eliminating false edges. However, current methods suffer from a limited evaluation dataset, imbalanced training data, and reduced recall, which affects practical downstream analyses. Moreover, prior results have not yet been compared with advanced static CG construction techniques. This study tackles these issues. We introduce NYXCorpus, a dataset of real-world Java programs with high test coverage, and we collect traces from test executions to build a ground truth of dynamic CGs. We leverage these CGs to explore conservative pruning strategies during the training and inference of ML-based CG pruners. We conduct a comparative analysis of static CGs generated using zero control flow analysis (0-CFA) and those produced by a context-sensitive 1-CFA algorithm, evaluating both with and without pruning. We find that CG pruning is a difficult task for real-world Java projects: substantial improvements in CG precision (+25%) come at the cost of reduced recall (-9%). However, our experiments show promising results: even when we favor recall over precision by using an F2 metric, pruned CGs have comparable quality to a context-sensitive 1-CFA analysis while being computationally less demanding. The resulting CGs are much smaller (69%) and substantially faster (3.5x speed-up), with virtually unchanged results in our downstream analysis.
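The F2 metric mentioned above is the β=2 case of the general F-beta score, F_β = (1+β²)·P·R / (β²·P + R), which weights recall β times as heavily as precision. A minimal sketch (a hypothetical helper, not code from this work) illustrates how a pruner tuned on F2 is penalized more for dropping true edges than for keeping false ones:

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

# With beta=2, a recall-heavy pruner outscores a precision-heavy one,
# even though both have the same F1:
recall_heavy = f_beta(0.5, 1.0)     # high recall, low precision
precision_heavy = f_beta(1.0, 0.5)  # high precision, low recall
assert recall_heavy > precision_heavy
```

At β=1 the two configurations would tie (F1 is symmetric in P and R), which is why the choice of F2 encodes the paper's stated preference for recall.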