Large language models (LLMs), pre-trained or fine-tuned on large code corpora, have shown effectiveness in generating code completions. However, in LLM-based code completion, LLMs may struggle to use correct and up-to-date Application Programming Interfaces (APIs) due to the rapid and continuous evolution of libraries. While existing studies have highlighted the problem of predicting incorrect APIs, the specific issue of deprecated API usage in LLM-based code completion has not been thoroughly investigated. To address this gap, we conducted the first evaluation study of deprecated API usage in LLM-based code completion. The study involved seven advanced LLMs, 145 API mappings from eight popular Python libraries, and 28,125 completion prompts. The results reveal the status quo (i.e., API usage plausibility and deprecated usage rate) of deprecated and replacement API usage in LLM-based code completion from the perspectives of model, prompt, and library, and identify the root causes behind it. Based on these findings, we propose two lightweight fixing approaches, REPLACEAPI and INSERTPROMPT, which can serve as baselines for future research on mitigating deprecated API usage in LLM-based code completion. Additionally, we outline implications for future research on integrating library evolution with LLM-driven software development.
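To make the notion of an "API mapping" concrete, the sketch below pairs deprecated APIs with their replacements and flags deprecated usage in a model completion. This is an illustrative example, not the paper's REPLACEAPI or INSERTPROMPT implementation; the mapping dictionary and helper function are our own, though the listed deprecations (e.g., pandas `DataFrame.append`) are real.

```python
# Hypothetical API-mapping table: deprecated API -> replacement API.
# The study collects 145 such mappings from eight Python libraries;
# the two entries here are illustrative examples of real deprecations.
API_MAPPINGS = {
    "pandas.DataFrame.append": "pandas.concat",        # deprecated in pandas 1.4
    "pandas.DataFrame.iteritems": "pandas.DataFrame.items",  # deprecated in pandas 1.5
}

def flag_deprecated(completion: str, mappings: dict) -> list:
    """Return (deprecated, replacement) pairs whose deprecated API
    appears as a method call in the given completion text."""
    hits = []
    for old, new in mappings.items():
        # Match on the attribute-call pattern, e.g. ".append(" for DataFrame.append.
        attr = old.rsplit(".", 1)[-1]
        if f".{attr}(" in completion:
            hits.append((old, new))
    return hits

# A completion using a deprecated API is flagged with its replacement.
completion = "df = df.append(row, ignore_index=True)"
print(flag_deprecated(completion, API_MAPPINGS))
```

A fixing approach in the spirit of REPLACEAPI would then rewrite the flagged call sites to use the replacement API, while one in the spirit of INSERTPROMPT would instead surface the mapping in the prompt before completion.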