The success of language models in code assistance has spurred the proposal of repository-level code completion, which improves prediction accuracy by drawing on context from the entire codebase. However, this larger context can increase inference latency, potentially undermining the developer experience and deterring tool adoption, a challenge we term the Context-Latency Conundrum. This paper introduces REPOFUSE, a solution designed to enhance repository-level code completion without the latency trade-off. REPOFUSE fuses two types of context: the analogy context, rooted in code analogies, and the rationale context, which captures in-depth semantic relationships. We propose a novel rank truncated generation (RTG) technique that efficiently condenses these contexts into prompts of restricted size. This enables REPOFUSE to deliver precise code completions while maintaining inference efficiency. On the CrossCodeEval benchmark, REPOFUSE significantly outperforms existing models, achieving a 40.90% to 59.75% increase in exact match (EM) accuracy for code completions and a 26.8% increase in inference speed. Beyond experimental validation, REPOFUSE has been integrated into the workflow of a large enterprise, where it actively supports various coding tasks.
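The abstract does not spell out how RTG works, so the following is only a minimal sketch of the general idea it names: rank candidate context chunks, then truncate the ranked list so the prompt stays within a fixed size. The `Chunk` type, the `score` field, the whitespace-based `count_tokens`, and the ordering of chunks in the prompt are all assumptions made for illustration, not REPOFUSE's actual design.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str   # "analogy" or "rationale" (labels borrowed from the abstract)
    text: str
    score: float  # hypothetical relevance score; higher means more relevant


def count_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer (whitespace-separated words).
    return len(text.split())


def rank_truncate(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Rank chunks by score and cut the ranked list where the budget is exceeded."""
    kept = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = count_tokens(chunk.text)
        if used + cost > budget:
            break  # truncate: everything ranked below this point is dropped
        kept.append(chunk)
        used += cost
    return kept


def build_prompt(chunks: list[Chunk], unfinished_code: str, budget: int) -> str:
    selected = rank_truncate(chunks, budget)
    # Assumption: place the highest-ranked chunk closest to the code being completed.
    context = "\n".join(c.text for c in reversed(selected))
    return context + "\n" + unfinished_code


candidates = [
    Chunk("analogy", "a b c", 0.9),
    Chunk("rationale", "d e", 0.5),
    Chunk("analogy", "f g h i", 0.7),
]
prompt = build_prompt(candidates, "x = ", budget=7)
```

With a budget of 7 pseudo-tokens, the two highest-scoring chunks fit exactly and the lowest-ranked one is cut. A fixed budget like this bounds prompt length regardless of repository size, which is the property the abstract ties to inference efficiency.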