Call graph construction is the foundation of inter-procedural static analysis. PyCG is the state-of-the-art approach for constructing call graphs for Python programs. Unfortunately, PyCG does not scale to large programs when adapted to whole-program analysis, where both the application and its dependent libraries are analyzed. Moreover, PyCG is flow-insensitive and does not fully support Python's language features, which limits its accuracy. To overcome these drawbacks, we propose a scalable and precise approach for constructing application-centered call graphs for Python programs, and implement it as a prototype tool, JARVIS. JARVIS maintains a type graph (i.e., the type relations of program identifiers) for each function in a program to enable type inference. Taking a function as input, JARVIS generates the call graph on the fly, alternating between flow-sensitive intra-procedural analysis and inter-procedural analysis and applying strong updates. Our evaluation on a micro-benchmark of 135 small Python programs and a macro-benchmark of 6 real-world Python applications shows that JARVIS significantly outperforms PyCG: it is at least 67% faster, 84% higher in precision, and at least 20% higher in recall.
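To make the contrast between flow-insensitive resolution and flow-sensitive resolution with strong updates concrete, the following is a minimal sketch on a toy straight-line program. It is an illustration only, not the implementation of JARVIS or PyCG: the statement encoding and the names `PROGRAM`, `flow_insensitive_edges`, and `flow_sensitive_edges` are hypothetical, and real Python features (branches, attributes, classes, imports) are deliberately ignored.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy encoding: ("assign", var, func) binds a variable to a
# function name; ("call", var) calls whatever the variable refers to.
Stmt = Tuple[str, ...]
Edge = Tuple[int, str]  # (index of the call statement, callee name)

PROGRAM: List[Stmt] = [
    ("assign", "handler", "parse_json"),
    ("call", "handler"),
    ("assign", "handler", "parse_xml"),
    ("call", "handler"),
]


def flow_insensitive_edges(program: List[Stmt]) -> Set[Edge]:
    """Ignore statement order: a variable may refer to every value ever assigned."""
    points_to: Dict[str, Set[str]] = {}
    for stmt in program:
        if stmt[0] == "assign":
            points_to.setdefault(stmt[1], set()).add(stmt[2])
    edges: Set[Edge] = set()
    for index, stmt in enumerate(program):
        if stmt[0] == "call":
            for callee in points_to.get(stmt[1], set()):
                edges.add((index, callee))
    return edges


def flow_sensitive_edges(program: List[Stmt]) -> Set[Edge]:
    """Follow statement order; each assignment replaces the old binding."""
    points_to: Dict[str, Set[str]] = {}
    edges: Set[Edge] = set()
    for index, stmt in enumerate(program):
        if stmt[0] == "assign":
            # Strong update: overwrite the previous set instead of merging.
            points_to[stmt[1]] = {stmt[2]}
        elif stmt[0] == "call":
            for callee in points_to.get(stmt[1], set()):
                edges.add((index, callee))
    return edges


if __name__ == "__main__":
    print(sorted(flow_insensitive_edges(PROGRAM)))
    print(sorted(flow_sensitive_edges(PROGRAM)))
```

In this toy program, the flow-insensitive variant reports both callees at both call sites, while the flow-sensitive variant reports exactly one callee per call site. The extra edges in the first result are the kind of spurious call edges that reduce precision.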