Predicting program behavior without execution is a crucial and challenging task in software engineering. Traditional models often struggle to capture the dynamic dependencies and interactions within code. This paper introduces CodeFlow, a novel machine learning-based framework that predicts code coverage and detects runtime errors through Dynamic Dependencies Learning. CodeFlow constructs control flow graphs (CFGs), which represent all possible execution paths and the relationships between statements, and learns vector representations for CFG nodes that capture static control-flow dependencies. It additionally learns dynamic dependencies from execution traces, which reflect how statements affect one another during execution. Together, these representations enable accurate prediction of code coverage and identification of runtime errors. Empirical evaluations show that CodeFlow outperforms existing models, with significant improvements in code coverage prediction accuracy and effective localization of runtime errors.
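To make the terms in the abstract concrete, the following is a minimal sketch of the objects involved: a CFG as an adjacency map, the execution paths it admits, and the per-statement coverage labels implied by one execution trace. It is not CodeFlow's implementation, which learns these relationships with a model; the toy program, the node numbering, and the helper names `all_paths`, `trace_is_valid`, and `coverage_labels` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy program, one CFG node per statement (illustrative only):
#   0: x = read()
#   1: if x > 0:
#   2:     y = 10 / x
#   3: else: y = 0
#   4: print(y)
CFG: Dict[int, List[int]] = {
    0: [1],
    1: [2, 3],  # branch: true edge to node 2, false edge to node 3
    2: [4],
    3: [4],
    4: [],      # exit node
}


def all_paths(cfg: Dict[int, List[int]], start: int, end: int) -> List[List[int]]:
    """Enumerate every loop-free path from start to end in the CFG."""
    paths: List[List[int]] = []

    def walk(node: int, path: List[int]) -> None:
        path = path + [node]
        if node == end:
            paths.append(path)
            return
        for successor in cfg[node]:
            if successor not in path:  # skip back edges to avoid cycling
                walk(successor, path)

    walk(start, [])
    return paths


def trace_is_valid(cfg: Dict[int, List[int]], trace: List[int]) -> bool:
    """Check that each consecutive pair of trace nodes is a CFG edge."""
    return all(b in cfg[a] for a, b in zip(trace, trace[1:]))


def coverage_labels(cfg: Dict[int, List[int]], trace: List[int]) -> Dict[int, bool]:
    """Mark each CFG node as covered (True) or not covered (False) by the trace."""
    executed = set(trace)
    return {node: node in executed for node in cfg}


if __name__ == "__main__":
    trace = [0, 1, 2, 4]  # one observed execution taking the true branch
    print(all_paths(CFG, 0, 4))
    print(trace_is_valid(CFG, trace))
    print(coverage_labels(CFG, trace))
```

In this sketch the coverage labels are computed from a trace that was actually observed. The task described in the abstract is to predict such labels, and the point at which a run would fail, without executing the program.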