Effective code optimization in compilers is crucial for computer and software engineering. The success of these optimizations depends primarily on the selection and ordering of the optimization passes applied to the code. While most compilers rely on a fixed sequence of optimization passes, current methods for finding the optimal sequence either employ impractically slow search algorithms or learning methods that struggle to generalize to code unseen during training. We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization. CompilerDream comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies. Trained on a large-scale program dataset, CompilerDream can serve as a general code optimizer across diverse application scenarios and source-code languages. Our extensive experiments first highlight CompilerDream's strong optimization capability for autotuning, where it leads the CompilerGym leaderboard. More importantly, the large-scale trained compiler world model and agent exhibit strong zero-shot generalization, excelling across diverse datasets and surpassing LLVM's built-in optimizations and other state-of-the-art methods in both value prediction and end-to-end code optimization.
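To make the core idea concrete, here is a minimal toy sketch of planning a pass sequence with a learned world model: a stand-in dynamics function predicts the reward of applying each pass to a latent program state, and a greedy agent selects passes in imagination. The pass names are real LLVM passes, but the dynamics, rewards, and all function names here are purely illustrative assumptions, not CompilerDream's actual model or API.

```python
# Hypothetical sketch: a "world model" predicts how each optimization pass
# changes a program state, and an agent plans a pass sequence against the
# model instead of the real compiler. All dynamics are toy stand-ins.
from typing import List, Tuple

PASSES = ["-mem2reg", "-gvn", "-instcombine", "-simplifycfg"]

# Illustrative per-pass effect strengths (assumed values, not measured).
EFFECT = {"-mem2reg": 0.30, "-gvn": 0.15,
          "-instcombine": 0.10, "-simplifycfg": 0.05}

def world_model_step(state: float, pass_name: str) -> Tuple[float, float]:
    """Toy latent dynamics: return (next_state, predicted_reward).

    Reward shrinks as the state shrinks, mimicking diminishing returns
    from repeated optimization.
    """
    reward = EFFECT[pass_name] * state
    return state - reward, reward

def plan_sequence(state: float, horizon: int) -> List[str]:
    """Greedy agent: at each step, pick the unused pass with the best
    one-step reward predicted by the world model."""
    seq, available = [], list(PASSES)
    for _ in range(horizon):
        best = max(available, key=lambda p: world_model_step(state, p)[1])
        state, _ = world_model_step(state, best)
        seq.append(best)
        available.remove(best)
    return seq

print(plan_sequence(1.0, 3))  # → ['-mem2reg', '-gvn', '-instcombine']
```

In practice, a world-model approach replaces the toy dynamics with a learned predictor and the greedy rule with a trained policy, so candidate sequences can be evaluated without repeatedly invoking the real compiler.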