Learning Performance-Improving Code Edits

The waning of Moore's Law has shifted the focus of the tech industry towards alternative methods for continued performance gains. While optimizing compilers are a standard tool to help increase program efficiency, programmers continue to shoulder much responsibility in crafting and refactoring code with better performance characteristics. In this paper, we investigate the ability of large language models (LLMs) to suggest functionally correct, performance-improving code edits. We hypothesize that language models can suggest such edits in ways that would be impractical for static analysis alone. We investigate these questions by curating a large-scale dataset of Performance-Improving Edits, PIE. PIE contains trajectories of programs, where a programmer begins with an initial, slower version and iteratively makes changes to improve the program's performance. We use PIE to evaluate and improve the capacity of large language models to suggest such edits. Specifically, we use examples from PIE to fine-tune multiple variants of CODEGEN, a billion-scale Transformer-decoder model. Additionally, we use examples from PIE to prompt OpenAI's CODEX using few-shot prompting. By leveraging PIE, we find that both CODEX and CODEGEN can generate performance-improving edits, with speedups of more than 2.5x for over 25% of the programs in both C++ and Python, even after the C++ programs were compiled using the O3 optimization level. Crucially, we show that PIE allows CODEGEN, an open-source model 10x smaller than CODEX, to match the performance of CODEX on this challenging task. Overall, this work opens new doors for creating systems and methods that can help programmers write efficient code.
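
To make the notion of a functionally correct, performance-improving edit concrete, the following is a minimal illustrative sketch in Python. It is not taken from PIE and does not reproduce the paper's evaluation harness: the program pair (`sum_to_n_slow`, `sum_to_n_fast`) and the helpers `is_functionally_equivalent` and `measure_speedup` are hypothetical names introduced here. The sketch pairs a slower initial version with an edited version, checks that both agree on a set of test inputs, and estimates the speedup as the ratio of their running times.

```python
import timeit


def sum_to_n_slow(n: int) -> int:
    """Hypothetical 'initial' program: sums 1..n with an O(n) loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_n_fast(n: int) -> int:
    """Hypothetical 'edited' program: same result via the O(1) closed form."""
    return n * (n + 1) // 2


def is_functionally_equivalent(inputs) -> bool:
    """Check that the edit preserves behaviour on the given test inputs."""
    return all(sum_to_n_slow(n) == sum_to_n_fast(n) for n in inputs)


def measure_speedup(n: int = 100_000, repeats: int = 5) -> float:
    """Estimate speedup as (slow time / fast time), using the best of several runs."""
    slow = min(timeit.repeat(lambda: sum_to_n_slow(n), number=10, repeat=repeats))
    fast = min(timeit.repeat(lambda: sum_to_n_fast(n), number=10, repeat=repeats))
    # Guard against a zero reading from a coarse timer (assumption: 1 ns floor).
    return slow / max(fast, 1e-9)


if __name__ == "__main__":
    # An edit only counts if it is both correct and faster.
    assert is_functionally_equivalent(range(0, 1000))
    print(f"estimated speedup: {measure_speedup():.1f}x")
```

Under this reading, an edit would count towards a threshold such as "more than 2.5x" only if the equivalence check passes and the measured ratio exceeds that threshold. Taking the minimum over repeated runs is a common way to reduce timing noise; the paper's actual measurement protocol may differ.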
