Learning Performance-Improving Code Edits

The waning of Moore's Law has shifted the focus of the tech industry towards alternative methods for continued performance gains. While optimizing compilers are a standard tool for increasing program efficiency, programmers continue to shoulder much of the responsibility for crafting and refactoring code with better performance characteristics. In this paper, we investigate the ability of large language models (LLMs) to suggest functionally correct, performance-improving code edits. We hypothesize that language models can suggest such edits in ways that would be impractical for static analysis alone. To investigate this hypothesis, we curate a large-scale dataset of Performance-Improving Edits, PIE. PIE contains trajectories of programs, in which a programmer begins with an initial, slower version and iteratively makes changes to improve the program's performance. We use PIE to evaluate and improve the capacity of large language models. Specifically, we use examples from PIE to fine-tune multiple variants of CODEGEN, a billion-scale Transformer-decoder model. Additionally, we use examples from PIE to prompt OpenAI's CODEX using few-shot prompting. By leveraging PIE, we find that both CODEX and CODEGEN can generate performance-improving edits, with speedups of more than 2.5x for over 25% of the programs, for both C++ and Python, even after the C++ programs were compiled using the O3 optimization level. Crucially, we show that PIE allows CODEGEN, an open-source model 10x smaller than CODEX, to match the performance of CODEX on this challenging task. Overall, this work opens new doors for creating systems and methods that can help programmers write efficient code.
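To make the notion of a performance-improving edit concrete, the following is a minimal illustrative sketch and is not drawn from PIE itself. The function names `sum_to_n_slow`, `sum_to_n_fast`, and `speedup` are hypothetical, and the timing procedure is an assumption for illustration rather than the evaluation protocol used in this work. It shows a slower program, a functionally equivalent faster rewrite, and one simple way to check equivalence and estimate the speedup.

```python
import timeit


def sum_to_n_slow(n: int) -> int:
    """Initial, slower version: O(n) loop over every integer."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_n_fast(n: int) -> int:
    """Edited version: O(1) closed-form expression with the same result."""
    return n * (n + 1) // 2


def speedup(slow, fast, arg, repeats: int = 5, number: int = 20) -> float:
    """Estimate how many times faster `fast` is than `slow` on `arg`.

    Hypothetical helper: first checks that both versions agree on the
    input, then compares the best-of-`repeats` wall-clock times.
    """
    if slow(arg) != fast(arg):
        raise ValueError("edit is not functionally equivalent on this input")
    t_slow = min(timeit.repeat(lambda: slow(arg), repeat=repeats, number=number))
    t_fast = min(timeit.repeat(lambda: fast(arg), repeat=repeats, number=number))
    # Guard against a zero reading from a coarse timer.
    return t_slow / max(t_fast, 1e-12)


if __name__ == "__main__":
    print(f"speedup: {speedup(sum_to_n_slow, sum_to_n_fast, 100_000):.1f}x")
```

A single input checks equivalence only weakly; a fuller harness would run each pair against a set of test cases before timing it.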
