Agentic Code Optimization via Compiler-LLM Cooperation

Generating performant executables from high-level languages is critical to software performance across a wide range of domains. Modern compilers perform this task by passing code through a series of well-studied optimizations at progressively lower levels of abstraction, but may miss optimization opportunities that require high-level reasoning about a program's purpose. Recent work has proposed using LLMs to fill this gap. While LLMs can achieve large speedups on some programs, they frequently generate incorrect code. In this work, we propose a method to balance the correctness of conventional compiler optimizations with the ``creativity'' of LLM-based code generation: compiler-LLM cooperation. Our approach integrates existing compiler optimization passes with LLM-based code generation at multiple levels of abstraction, retaining the best features of both types of code optimization. We realize our approach with a multi-agent system that includes (1) LLM-based optimization agents for each level of abstraction, (2) individual compiler constituents as tools, (3) an LLM-based test generation agent that probes the correctness and performance of generated code, and (4) a guiding LLM that orchestrates the other components. This strategy enables LLM-based optimization of input programs at multiple levels of abstraction and introduces a method for distributing computational budget between levels. Our extensive evaluation shows that compiler-LLM cooperation outperforms both existing compiler optimizations and level-specific LLM-based baselines, producing speedups of up to 1.25x.
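
To make the cooperation idea concrete, the following is a minimal sketch of how a guiding component might spend a fixed proposal budget across abstraction levels while keeping only candidates that pass tests. It is an illustration under stated assumptions, not the system described above: `Candidate`, `cooperate`, the stub agents, the greedy allocation rule, and all speedup values are hypothetical, and the LLM agents, compiler tools, and test generation agent are replaced by plain Python stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Candidate:
    level: str          # abstraction level that produced the candidate
    speedup: float      # speedup over the compiler-only baseline (made-up values)
    passes_tests: bool  # verdict of a stand-in for the test generation agent


# A proposer stands in for one LLM-based optimization agent (hypothetical).
# It receives the number of earlier attempts made at its level.
Proposer = Callable[[int], Candidate]


def cooperate(proposers: Dict[str, Proposer], budget: int) -> Candidate:
    """Spend `budget` proposals across levels; return the fastest correct candidate."""
    # Fallback: the unmodified compiler output, assumed correct, speedup 1.0.
    best = Candidate(level="compiler", speedup=1.0, passes_tests=True)
    levels = list(proposers)
    best_per_level = {level: 1.0 for level in levels}
    attempts = {level: 0 for level in levels}

    for step in range(budget):
        if step < len(levels):
            # Try every level once before favoring any of them.
            level = levels[step]
        else:
            # Assumed greedy rule: spend on the level with the best correct result so far.
            level = max(levels, key=lambda name: best_per_level[name])

        candidate = proposers[level](attempts[level])
        attempts[level] += 1

        if not candidate.passes_tests:
            continue  # incorrect code is discarded; the current best is kept

        best_per_level[level] = max(best_per_level[level], candidate.speedup)
        if candidate.speedup > best.speedup:
            best = candidate

    return best


# Stub agents with invented behavior, for illustration only.
def source_agent(attempt: int) -> Candidate:
    # Faster on each attempt, but every second proposal fails its tests.
    return Candidate("source", 1.1 + 0.1 * attempt, passes_tests=(attempt % 2 == 0))


def ir_agent(attempt: int) -> Candidate:
    # Always correct, with a small fixed gain.
    return Candidate("ir", 1.05, passes_tests=True)


best = cooperate({"source": source_agent, "ir": ir_agent}, budget=5)
print(best.level, round(best.speedup, 2))
```

The design choice worth noting is the fallback: because the compiler-only result is the initial best candidate and failing candidates are skipped, the loop can never return something slower or less correct than the baseline. Any smarter budget policy could replace the greedy rule without changing that property.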
