Compiler optimization improves program performance by selecting and ordering the optimization passes applied to a program while preserving its correctness. Despite the promise of large language model (LLM)-based agents for software optimization, automating compiler optimization remains challenging for three reasons: (1) semantic misalignment between abstract program representations and concrete optimization passes, (2) inefficient interaction mechanisms between agents and compiler environments, and (3) reward sparsity arising from long decision sequences in large optimization spaces. This paper introduces \textbf{AwareCompiler}, an agentic framework for compiler optimization that addresses these challenges through three key innovations: structured knowledge integration and dataset construction, knowledge-driven adaptive pass generation, and a data-driven hybrid training pipeline. Experimental results on standard benchmarks show that AwareCompiler significantly outperforms existing baselines in both performance and efficiency, highlighting the effectiveness of our synergistic knowledge- and data-driven approach. Our code is publicly available at https://github.com/LHY-24/AwareCompiler.