Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models

Pythonic idioms are highly valued and widely used in the Python programming community, yet many Python users find them challenging to apply. Neither a rule-based approach nor an LLM-only approach is sufficient to overcome three persistent challenges of code idiomatization: code miss, wrong detection, and wrong refactoring. Motivated by the determinism of rules and the adaptability of LLMs, we propose a hybrid approach consisting of three modules. In addition to writing prompts that instruct LLMs to complete tasks, we invoke Analytic Rule Interfaces (ARIs), which are Python code generated by prompting LLMs. First, we construct a knowledge module with three elements (ASTscenario, ASTcomponent, and Condition) and prompt LLMs to generate Python code that is incorporated into an ARI library for subsequent use. Second, for any syntax-error-free Python code, we invoke ARIs from the ARI library to extract each ASTcomponent from the ASTscenario and then filter out every ASTcomponent that does not meet the Condition. Third, we design prompts that instruct LLMs to abstract and idiomatize code, and then invoke ARIs from the ARI library to rewrite the non-idiomatic code into idiomatic code. We conduct a comprehensive evaluation of our approach, RIdiom, and Prompt-LLM on the nine established Pythonic idioms in RIdiom. Our approach achieves superior accuracy, F1-score, and recall while maintaining precision comparable to RIdiom; each metric exceeds or comes close to 90% for every idiom. Finally, we extend the evaluation to four new Pythonic idioms, where our approach consistently outperforms Prompt-LLM, with accuracy, F1-score, precision, and recall all exceeding 90%.
