Python type annotations enable static type checking, but most code remains untyped because manual annotation is time-consuming and tedious. Past approaches to automatic type inference fall short: static methods struggle with dynamic features and infer overly broad types; AI-based methods are unsound and miss rare types; and dynamic methods impose extreme overheads (up to 270x), lack important language support such as inferring variable types, or produce annotations that cause runtime errors. This paper presents RightTyper, a novel hybrid approach for Python that produces accurate and precise type annotations grounded in actual program behavior. RightTyper grounds inference in types observed during actual program execution and combines these observations with static analysis and name resolution to produce substantially higher-quality type annotations than prior approaches. Through principled, statistically guided adaptive sampling, RightTyper balances runtime overhead with the need to observe sufficient execution behavior to infer high-quality type annotations. We evaluate RightTyper against static, dynamic, and AI-based systems on both synthetic benchmarks and real-world code, and find that it consistently achieves higher semantic similarity to ground-truth and developer-written annotations, respectively, while incurring only approximately 27% runtime overhead.
翻译:暂无翻译