Learning Performance-Improving Code Edits

The waning of Moore's Law has shifted the focus of the tech industry towards alternative methods for continued performance gains. While optimizing compilers are a standard tool for increasing program efficiency, programmers still shoulder much of the responsibility for writing and refactoring code with better performance characteristics. In this paper, we investigate the ability of large language models (LLMs) to suggest functionally correct, performance-improving code edits. We hypothesize that language models can suggest such edits in ways that would be impractical for static analysis alone. To investigate this hypothesis, we curate a large-scale dataset of Performance-Improving Edits, PIE. PIE contains trajectories of programs, where a programmer begins with an initial, slower version and iteratively makes changes to improve the program's performance. We use PIE to evaluate and improve the capacity of large language models. Specifically, we use examples from PIE to fine-tune multiple variants of CODEGEN, a billion-scale Transformer-decoder model. Additionally, we use examples from PIE to prompt OpenAI's CODEX using few-shot prompting. By leveraging PIE, we find that both CODEX and CODEGEN can generate performance-improving edits, with speedups of more than 2.5x for over 25% of the programs in both C++ and Python, even after the C++ programs were compiled at the O3 optimization level. Crucially, we show that PIE allows CODEGEN, an open-source model 10x smaller than CODEX, to match the performance of CODEX on this challenging task. Overall, this work opens new doors for creating systems and methods that can help programmers write efficient code.
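
To make the notion of a performance-improving edit concrete, the following is a minimal sketch of one slow/fast program pair, a functional-correctness check, and a speedup measurement. It is not drawn from PIE. The function names (`slow_sum`, `fast_sum`, `is_functionally_equivalent`, `measure_speedup`) are hypothetical, and defining speedup as old runtime divided by new runtime is an assumption, since the abstract does not state how runtimes are measured.

```python
import timeit


def slow_sum(n: int) -> int:
    """Hypothetical initial version: sums 1..n with an O(n) loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def fast_sum(n: int) -> int:
    """Hypothetical edited version: same result via the O(1) closed form."""
    return n * (n + 1) // 2


def is_functionally_equivalent(inputs) -> bool:
    """Check that the edit preserves behavior on the given test inputs."""
    return all(slow_sum(n) == fast_sum(n) for n in inputs)


def measure_speedup(n: int, repeats: int = 5, number: int = 20) -> float:
    """Return old_time / new_time (assumed definition of speedup).

    Takes the minimum over several repeats to reduce timing noise.
    """
    slow = min(timeit.repeat(lambda: slow_sum(n), repeat=repeats, number=number))
    fast = min(timeit.repeat(lambda: fast_sum(n), repeat=repeats, number=number))
    # Guard against a zero reading from the timer on very fast code.
    return slow / max(fast, 1e-12)


if __name__ == "__main__":
    print("equivalent:", is_functionally_equivalent([0, 1, 10, 1000]))
    print(f"speedup: {measure_speedup(100_000):.1f}x")
```

An edit like this changes the algorithm rather than the instruction schedule, which is the kind of rewrite the abstract suggests is impractical for static analysis alone. The measured ratio will vary by machine and input size.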