Large language models (LLMs) have demonstrated strong capabilities across a wide range of programming tasks, yet they have rarely been explored for code optimization. In this paper, we study code optimization with a focus on performance enhancement, specifically aiming to minimize execution time. PIE, the recently proposed first dataset for performance optimization, constructs program optimization pairs from iterative submissions by the same programmer for the same problem. However, this construction restricts LLMs to local performance improvements and neglects global algorithmic innovation. We therefore adopt a fundamentally different perspective and reconstruct the optimization pairs in a problem-oriented manner, which integrates the ingenious ideas of different programmers tackling the same problem. Experimental results demonstrate that adapting LLMs to problem-oriented optimization pairs significantly enhances their optimization capabilities. We also identify performance bottlenecks within the problem-oriented perspective; by employing model merging, we further overcome these bottlenecks and ultimately raise the program optimization ratio ($51.76\%\rightarrow76.65\%$) and speedup ($2.65\times\rightarrow5.09\times$) to new levels.
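The problem-oriented reconstruction described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the schema (`problem_id`, `user_id`, `code`, `runtime`) and the policy of pairing every slower program with the globally fastest solution(s) for that problem are assumptions made for the example.

```python
from collections import defaultdict

def build_problem_oriented_pairs(submissions, top_k=1):
    """Pair each slower program with the fastest solution(s) for the
    same problem, regardless of which programmer wrote them.

    `submissions`: list of dicts with keys 'problem_id', 'user_id',
    'code', 'runtime' (hypothetical schema for illustration).
    Returns (slow_code, fast_code) tuples.
    """
    # Group all accepted submissions by problem rather than by programmer.
    by_problem = defaultdict(list)
    for sub in submissions:
        by_problem[sub["problem_id"]].append(sub)

    pairs = []
    for subs in by_problem.values():
        subs.sort(key=lambda s: s["runtime"])  # fastest first
        targets = subs[:top_k]                 # globally best solutions
        for slow in subs[top_k:]:
            for fast in targets:
                # Only keep pairs that actually represent a speedup.
                if fast["runtime"] < slow["runtime"]:
                    pairs.append((slow["code"], fast["code"]))
    return pairs
```

In contrast, a programmer-oriented construction (as in PIE) would only pair submissions sharing the same `user_id`, so a pair can never cross programmers and thus never captures a different programmer's faster algorithm.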