Large Language Models (LLMs) have demonstrated significant capability in code generation, but their potential for code efficiency optimization remains underexplored. Previous LLM-based approaches to code efficiency optimization focus exclusively on function-level optimization and overlook interactions between functions, so they fail to generalize to real-world development scenarios. Code editing techniques show great potential for project-level optimization, yet they face challenges from invalid edits and suboptimal internal functions. To address these gaps, we propose Peace, a novel hybrid framework for Project-level code Efficiency optimization through Automatic Code Editing, which also ensures the overall correctness and integrity of the project. Peace integrates three key phases: dependency-aware optimizing function sequence construction, valid associated edits identification, and efficiency optimization editing iteration. To rigorously evaluate Peace, we construct PeacExec, the first benchmark comprising 146 real-world optimization tasks from 47 high-impact GitHub Python projects, along with high-quality test cases and executable environments. Extensive experiments demonstrate Peace's superiority over state-of-the-art baselines, achieving a 69.2% correctness rate (pass@1), +46.9% opt rate, and 0.840 speedup in execution efficiency. Notably, Peace outperforms all baselines by significant margins, particularly on complex optimization tasks involving multiple functions. Further experiments validate the contribution of each component of Peace, as well as the rationale and effectiveness of the hybrid framework design.
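To make the first phase more concrete, the following is a minimal sketch of what a dependency-aware function sequence could look like. It is not Peace's implementation: the abstract does not specify how the sequence is built, so the callee-before-caller ordering, the restriction to top-level functions in a single module, and the names `build_call_graph`, `optimization_order`, and `SAMPLE` are all assumptions made for illustration.

```python
import ast
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical module used only to illustrate the idea.
SAMPLE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return tokenize(text)

def tokenize(text):
    return text.split()
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the top-level functions it calls."""
    tree = ast.parse(source)
    functions = {
        node.name: node for node in tree.body if isinstance(node, ast.FunctionDef)
    }
    graph: dict[str, set[str]] = {}
    for name, node in functions.items():
        callees = set()
        for child in ast.walk(node):
            # Only direct calls by bare name are tracked; method calls,
            # imports, and dynamic dispatch are ignored in this sketch.
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                if child.func.id in functions and child.func.id != name:
                    callees.add(child.func.id)
        graph[name] = callees
    return graph


def optimization_order(source: str) -> list[str]:
    """Return function names so that every callee precedes its callers.

    Assumption: optimizing callees first is one plausible ordering.
    Mutually recursive functions raise graphlib.CycleError here.
    """
    return list(TopologicalSorter(build_call_graph(source)).static_order())


if __name__ == "__main__":
    print(optimization_order(SAMPLE))
```

In this sketch, `tokenize` is scheduled before `parse`, and both `read` and `parse` before `load`, so a caller is only edited after the functions it depends on have been considered. A project-level tool would additionally need cross-file resolution and handling of cycles, which this example leaves out.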