With growing interest in type inference research for Python, both researchers and practitioners need a standardized process for assessing the performance of type inference techniques. This paper introduces TypeEvalPy, a comprehensive micro-benchmarking framework for evaluating type inference tools. TypeEvalPy contains 154 code snippets with 845 type annotations across 18 categories, each targeting different Python features. The framework manages the execution of containerized tools, transforms the inferred types into a standardized format, and produces metrics for assessment. Using the framework, we compare the performance of six type inference tools and highlight their strengths and limitations. Our findings provide a foundation for further research and optimization in the domain of Python type inference.
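To make the "standardized format, then metrics" step concrete, the following is a minimal sketch of how inferred types might be scored against ground-truth annotations once both are in a common form. The `TypeFact` record, its fields, and the exact-match precision/recall scoring are assumptions made for illustration; they are not taken from TypeEvalPy's actual schema or metric definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeFact:
    """One type annotation at a source location (hypothetical schema)."""
    file: str
    line: int
    name: str
    types: frozenset  # set of type names, e.g. frozenset({"int"})


def precision_recall(expected, inferred):
    """Score inferred facts against expected ones by exact match.

    A fact counts as a hit only if file, line, name, and the full
    set of types are identical (an assumed, strict matching rule).
    """
    expected_set = set(expected)
    inferred_set = set(inferred)
    hits = len(expected_set & inferred_set)
    precision = hits / len(inferred_set) if inferred_set else 0.0
    recall = hits / len(expected_set) if expected_set else 0.0
    return precision, recall


# Hypothetical ground truth for a two-line snippet.
expected = [
    TypeFact("main.py", 1, "x", frozenset({"int"})),
    TypeFact("main.py", 2, "y", frozenset({"str"})),
]

# Hypothetical tool output: correct for x, wrong for y.
inferred = [
    TypeFact("main.py", 1, "x", frozenset({"int"})),
    TypeFact("main.py", 2, "y", frozenset({"int"})),
]

precision, recall = precision_recall(expected, inferred)
```

Normalizing every tool's output into one record type before scoring is what allows a single comparison routine to serve all tools; a real benchmark would likely also need rules for partial matches and type aliases, which this sketch omits.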