Compiler auto-tuning faces a dichotomy between traditional black-box search methods, which lack semantic guidance, and recent Large Language Model (LLM) approaches, which often suffer from superficial pattern matching and causal opacity. In this paper, we introduce ECCO, a framework that bridges interpretable reasoning with combinatorial search. We first propose a reverse engineering methodology to construct a Chain-of-Thought dataset, explicitly mapping static code features to verifiable performance evidence. This enables the model to learn the causal logic governing optimization decisions rather than merely imitating sequences. Leveraging this interpretable prior, we design a collaborative inference mechanism where the LLM functions as a strategist, defining optimization intents that dynamically guide the mutation operations of a genetic algorithm. Experimental results on seven datasets demonstrate that ECCO significantly outperforms the LLVM opt -O3 baseline, achieving an average 24.44% reduction in cycles.
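The collaborative mechanism described above — an LLM strategist whose optimization intents bias the mutation operator of a genetic algorithm over pass sequences — can be sketched as follows. This is a minimal illustration, not ECCO's implementation: the pass names are real LLVM passes, but `INTENT_WEIGHTS` (standing in for the LLM's intent) and `surrogate_cycles` (standing in for a measured cycle count) are hypothetical placeholders.

```python
import random

# Candidate genes: a small pool of real LLVM optimization passes.
PASS_POOL = ["licm", "gvn", "instcombine", "loop-unroll", "inline", "sccp"]

# Hypothetical intent from the LLM strategist: for loop-heavy code,
# bias mutation toward loop-oriented passes. In ECCO this guidance
# would come from the model's reasoning, not a fixed table.
INTENT_WEIGHTS = {"licm": 3.0, "loop-unroll": 3.0}

def mutate(seq, rng):
    """Replace one position, sampling replacement passes by intent weight."""
    weights = [INTENT_WEIGHTS.get(p, 1.0) for p in PASS_POOL]
    i = rng.randrange(len(seq))
    child = list(seq)
    child[i] = rng.choices(PASS_POOL, weights=weights, k=1)[0]
    return child

def surrogate_cycles(seq):
    """Toy fitness rewarding loop passes; a real run would compile with
    `opt` and measure execution cycles."""
    return 100 - 10 * sum(p in ("licm", "loop-unroll") for p in seq)

def evolve(generations=50, pop_size=8, seq_len=4, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(PASS_POOL) for _ in range(seq_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=surrogate_cycles)        # lower cycle count is better
        parents = pop[: pop_size // 2]        # truncation selection
        pop = parents + [mutate(rng.choice(parents), rng) for _ in parents]
    return min(pop, key=surrogate_cycles)

best = evolve()
print(best, surrogate_cycles(best))
```

The key design point the sketch captures is that the intent does not dictate a fixed sequence: it only reweights the mutation distribution, so the genetic search retains its exploratory character while being steered toward semantically motivated regions of the pass-ordering space.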