Long methods that encapsulate multiple responsibilities within a single method are challenging to maintain. Choosing which statements to extract into new methods has been the target of many research tools. Despite steady improvements, these tools often fail to generate refactorings that align with developers' preferences and acceptance criteria. Given that Large Language Models (LLMs) have been trained on large code corpora, if we harness their familiarity with the way developers form functions, we could suggest refactorings that developers are likely to accept. In this paper, we advance the science and practice of refactoring by synergistically combining the insights of LLMs with the power of IDEs to perform Extract Method (EM) refactoring. Our formative study on 1752 EM scenarios revealed that LLMs are very effective at giving expert suggestions, yet they are unreliable: up to 76.3% of the suggestions are hallucinations. We designed a novel approach that removes hallucinations from the candidates suggested by LLMs, then further enhances and ranks the suggestions using static analysis techniques from program slicing, and finally leverages the IDE to execute the refactoring correctly. We implemented this approach in an IntelliJ IDEA plugin called EM-Assist. We empirically evaluated EM-Assist on a diverse corpus that replicates 1752 actual refactorings from open-source projects and found that it outperforms previous state-of-the-art tools: EM-Assist suggests the developer-performed refactoring in 53.4% of cases, improving over the 39.4% recall rate of previous best-in-class tools. Furthermore, we conducted firehouse surveys with 16 industrial developers, suggesting refactorings on their recent commits; 81.3% of them agreed with the recommendations provided by EM-Assist.
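To make the refactoring concrete, the following is a minimal, hypothetical illustration (not taken from the paper's corpus) of what an Extract Method suggestion looks like: statements that computed a total inside a longer method are moved into their own method, leaving the original with a single responsibility. The class and method names are invented for this sketch.

```java
// Hypothetical example of an Extract Method (EM) refactoring result.
// Before the refactoring, printInvoice both built the header and
// summed the amounts; the summation statements were extracted below.
class InvoicePrinter {
    static String printInvoice(String customer, double[] amounts) {
        String header = "Invoice for " + customer;
        // The extracted method is now called instead of inlined statements.
        return header + "\nTotal: " + computeTotal(amounts);
    }

    // Extracted method: the statements that summed the amounts moved here.
    static double computeTotal(double[] amounts) {
        double total = 0;
        for (double a : amounts) {
            total += a;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(printInvoice("Ada", new double[]{10.0, 5.5}));
    }
}
```

A tool like EM-Assist must decide which contiguous (and semantically extractable) statements to pull out; the paper's contribution is using LLM suggestions, filtered and ranked via program slicing, and then delegating the mechanical transformation to the IDE.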