Modern AI agents optimize programs by refactoring source code to trigger trusted compiler transformations. This preserves program semantics and reduces source code pollution, making the program easier to maintain and more portable across architectures. However, this collaborative workflow is limited by legacy compiler interfaces, which obscure their analysis behind unstructured, lossy optimization remarks that were designed for human intuition rather than machine logic. Using the TSVC benchmark, we evaluate the efficacy of existing optimization feedback. We find that while precise remarks provide actionable feedback (3.3x success rate), ambiguous remarks are actively detrimental, triggering semantics-breaking hallucinations. By replacing ambiguous remarks with precise ones, we show that structured, precise analysis information unlocks the capabilities of small models, proving that the bottleneck is the interface, not the agent. We conclude that future compilers must expose structured, actionable feedback designed specifically for autonomous performance engineering.