In light of the growing interest in type inference research for Python, both researchers and practitioners require a standardized process to assess the performance of various type inference techniques. This paper introduces TypeEvalPy, a comprehensive micro-benchmarking framework for evaluating type inference tools. TypeEvalPy contains 154 code snippets with 845 type annotations across 18 categories that target various Python features. The framework manages the execution of containerized tools, transforms inferred types into a standardized format, and produces meaningful metrics for assessment. Through our analysis, we compare the performance of six type inference tools, highlighting their strengths and limitations. Our findings provide a foundation for further research and optimization in the domain of Python type inference.
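As a rough illustration of the last two framework stages described above (converting inferred types to a standardized format and scoring them), the following minimal sketch compares one tool's output against ground-truth annotations. The record key `(file, line, name)`, the alias table in `normalize`, and the choice of exact-match precision and recall are assumptions made for this example; they are not taken from TypeEvalPy's actual schema or metric definitions.

```python
from typing import Dict, Tuple

# Hypothetical record key: (file, line number, identifier name).
# This shape is assumed for illustration, not TypeEvalPy's real schema.
Key = Tuple[str, int, str]

# Assumed alias table mapping tool-specific spellings to one canonical form.
ALIASES = {
    "builtins.str": "str",
    "builtins.int": "int",
    "List": "list",
    "Dict": "dict",
}


def normalize(type_name: str) -> str:
    """Return a canonical spelling for a type name (hypothetical rules)."""
    name = type_name.strip()
    return ALIASES.get(name, name)


def exact_match_metrics(
    truth: Dict[Key, str], inferred: Dict[Key, str]
) -> Dict[str, float]:
    """Score inferred types against ground truth by exact match."""
    normalized = {key: normalize(value) for key, value in inferred.items()}
    # A match requires the same key and the same normalized type name.
    matches = sum(
        1 for key, value in truth.items() if normalized.get(key) == normalize(value)
    )
    precision = matches / len(inferred) if inferred else 0.0
    recall = matches / len(truth) if truth else 0.0
    return {"exact_matches": matches, "precision": precision, "recall": recall}
```

Normalizing before comparison keeps a tool from being penalized for spelling a type differently (for example `builtins.str` instead of `str`), which is the kind of discrepancy a standardized output format is meant to remove.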