Code completion models have made significant progress in recent years. Repository-level code completion has recently drawn growing attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fail to fully exploit the extensive context of a project repository, such as the details of relevant files and class hierarchies. In addition, existing benchmarks usually cover only a limited set of code completion scenarios and therefore cannot adequately reflect the repository-level code completion abilities of existing methods. To address these limitations, we propose R2C2-Coder to enhance and benchmark the real-world repository-level code completion abilities of code Large Language Models. R2C2-Coder comprises a code prompt construction method, R2C2-Enhance, and a well-designed benchmark, R2C2-Bench. Specifically, in R2C2-Enhance, we first construct a candidate retrieval pool and then, for each completion cursor position, assemble the completion prompt by retrieving from this pool. Building on R2C2-Enhance, we then construct a more challenging and diverse benchmark, R2C2-Bench, with training, validation, and test splits, in which a context perturbation strategy is proposed to better simulate real-world repository-level code completion. Extensive results on multiple benchmarks demonstrate the effectiveness of R2C2-Coder.
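The two-stage idea behind R2C2-Enhance (build a candidate retrieval pool, then assemble a prompt per cursor position) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-size line windows, the Jaccard token-overlap scoring, and the names `Chunk`, `build_pool`, and `assemble_prompt` are all assumptions made for illustration, since the abstract does not specify how the pool is built or ranked.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A candidate snippet in the retrieval pool (hypothetical structure)."""
    path: str
    text: str


def tokenize(text: str) -> set[str]:
    """Reduce code to a set of identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-set overlap; an assumed stand-in for the paper's retrieval score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_pool(files: dict[str, str], window: int = 4) -> list[Chunk]:
    """Stage 1: cut every repository file into non-overlapping line windows."""
    pool = []
    for path, source in files.items():
        lines = source.splitlines()
        for start in range(0, len(lines), window):
            text = "\n".join(lines[start:start + window])
            if text.strip():  # skip windows that contain only whitespace
                pool.append(Chunk(path, text))
    return pool


def assemble_prompt(pool: list[Chunk], current_path: str, prefix: str,
                    top_k: int = 2, query_lines: int = 3) -> str:
    """Stage 2: retrieve the chunks most similar to the code just before the
    cursor and prepend them, as comments, to the in-file prefix."""
    query = tokenize("\n".join(prefix.splitlines()[-query_lines:]))
    # Only chunks from other files are candidates for cross-file context.
    candidates = [c for c in pool if c.path != current_path]
    ranked = sorted(candidates,
                    key=lambda c: jaccard(query, tokenize(c.text)),
                    reverse=True)
    parts = []
    for chunk in ranked[:top_k]:
        commented = "\n".join("# " + line for line in chunk.text.splitlines())
        parts.append(f"# Retrieved from {chunk.path}:\n{commented}")
    parts.append(prefix)  # the code before the cursor always comes last
    return "\n\n".join(parts)
```

Calling `build_pool` once per repository and `assemble_prompt` once per cursor position mirrors the order described above. A real system would likely use a stronger retriever and a token budget for the prompt, neither of which is modeled here.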