As Python is increasingly adopted for large and complex programs, static analysis for Python, such as type inference, grows in importance. Unfortunately, static analysis for Python remains challenging because of the language's dynamic features and its abundant external libraries. To help fill this gap, this paper presents PoTo, an Andersen-style context-insensitive and flow-insensitive points-to analysis for Python. PoTo addresses Python-specific challenges and scales to large programs via a novel hybrid evaluation, which integrates traditional static points-to analysis with concrete evaluation in the Python interpreter for external library calls. Building on this points-to analysis, the paper then presents PoTo+, a static type inference for Python. We evaluate PoTo+ and compare it to two state-of-the-art Python type inference techniques: (1) the static rule-based Pytype and (2) the deep-learning-based DLInfer. Our results show that PoTo+ outperforms both Pytype and DLInfer on existing Python packages.
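To make the idea of hybrid evaluation concrete, the following is a minimal sketch of an Andersen-style (inclusion-based, flow-insensitive) points-to solver in which the result of an external library call is obtained by running the call in the interpreter. It is an illustration under stated assumptions, not PoTo's implementation: the constraint encoding, the `(site, type name)` representation of abstract objects, and the names `solve` and `concrete_object` are all hypothetical.

```python
import math
from collections import defaultdict


def concrete_object(site, func, *args):
    """Assumed hybrid step: run an external call concretely, abstract the result by its type."""
    result = func(*args)
    return (site, type(result).__name__)


def solve(constraints):
    """Naive fixed-point solver for inclusion constraints.

    Hypothetical constraint forms used by this sketch:
      ("alloc", x, obj)    x = <abstract object obj>
      ("copy",  x, y)      x = y
      ("store", x, f, y)   x.f = y
      ("load",  x, y, f)   x = y.f
    """
    pts = defaultdict(set)   # variable -> set of abstract objects
    heap = defaultdict(set)  # (abstract object, field) -> set of abstract objects
    changed = True
    while changed:
        changed = False
        for c in constraints:
            kind = c[0]
            if kind == "alloc":
                _, x, obj = c
                updates = [(pts[x], {obj})]
            elif kind == "copy":
                _, x, y = c
                updates = [(pts[x], set(pts[y]))]
            elif kind == "store":
                _, x, f, y = c
                updates = [(heap[(o, f)], set(pts[y])) for o in set(pts[x])]
            elif kind == "load":
                _, x, y, f = c
                updates = [(pts[x], set(heap[(o, f)])) for o in set(pts[y])]
            else:
                raise ValueError(f"unknown constraint kind: {kind}")
            for dest, new in updates:
                if not new <= dest:  # grow the set only when something is new
                    dest |= new
                    changed = True
    return pts, heap


# Program being modeled (statement order is irrelevant to the analysis):
#   a = A(); b = a; b.f = math.floor(2.5); c = a.f
constraints = [
    ("load", "c", "a", "f"),  # listed first on purpose: flow-insensitive
    ("alloc", "a", ("A@1", "A")),
    ("copy", "b", "a"),
    ("alloc", "t", concrete_object("math.floor@3", math.floor, 2.5)),
    ("store", "b", "f", "t"),
]
pts, heap = solve(constraints)

# Type inference on top of points-to sets: collect the type of each abstract object.
inferred = {var: {type_name for (_, type_name) in objs} for var, objs in pts.items()}
```

In this sketch, `inferred["c"]` contains `int` because `b` aliases `a` and the concretely evaluated call supplies an object of a known type, even though the library function itself is never analyzed statically. A realistic analysis would also need to handle calls, containers, and cases where concrete evaluation is not possible, none of which are modeled here.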