Excessively long methods that encapsulate multiple responsibilities are challenging to comprehend, debug, reuse, and maintain. The solution to this problem, a hallmark refactoring called Extract Method, consists of two phases: (i) choosing the statements to extract and (ii) applying the mechanics to perform the refactoring. While the application phase has been a staple feature of all modern IDEs, IDEs leave it up to developers to choose the statements to extract. Choosing which statements are profitable to extract has been the subject of many research tools that employ hard-coded rules to optimize software quality metrics. Despite steady improvements, these tools often fail to generate refactorings that align with developers' preferences and acceptance criteria. In this paper, we introduce EM-Assist, a tool that augments the refactoring capabilities of IDEs with the power of LLMs to perform Extract Method refactoring. We empirically evaluated EM-Assist on a diverse, publicly available corpus that other researchers used in the past. The results show that EM-Assist outperforms previous state-of-the-art tools: at 1% tolerance, EM-Assist suggests the correct refactoring among its top-5 suggestions 60.6% of the time, compared to 54.2% reported by existing ML models and 52.2% reported by existing static analysis tools. When we replicated 2,849 actual Extract Method instances from open-source projects, EM-Assist's recall rate was 42.1%, compared to 6.5% for its peers. Furthermore, we conducted firehouse surveys with 20 industrial developers and suggested refactorings on their recent commits; 81.3% of the respondents agreed with the recommendations provided by EM-Assist. This shows the usefulness of our approach and ushers us into a new era of refactoring when using LLMs.
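For readers unfamiliar with the refactoring, the two phases described above can be illustrated with a minimal sketch. The example is written in Python purely for brevity, and the function names (`build_invoice_long`, `compute_total`, `build_invoice`) are hypothetical; it is not taken from EM-Assist or its evaluation corpus. Phase (i) corresponds to deciding that the summation loop is worth extracting, and phase (ii) to mechanically moving it into its own method.

```python
def build_invoice_long(customer, orders):
    """Before: one method both computes the total and formats the report."""
    lines = [f"Invoice for {customer}"]
    total = 0.0
    for _name, price, qty in orders:
        total += price * qty
    lines.append(f"Total due: {total:.2f}")
    return "\n".join(lines)


def compute_total(orders):
    """Extracted method: the statements that sum the order amounts."""
    total = 0.0
    for _name, price, qty in orders:
        total += price * qty
    return total


def build_invoice(customer, orders):
    """After: the caller delegates the calculation to the extracted method."""
    lines = [f"Invoice for {customer}"]
    lines.append(f"Total due: {compute_total(orders):.2f}")
    return "\n".join(lines)
```

Both versions should produce the same output for the same input; the refactoring changes structure, not behavior. The hard part that the abstract refers to is phase (i): in a realistic long method there are many candidate statement ranges, and only some of them match what a developer would accept.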