PipeRTL: Timing-Aware Pipeline Optimization at IR-Level for RTL Generation

Modern hardware compilers increasingly rely on rich intermediate representations (IRs) to preserve optimization-relevant semantics before generating RTL. However, one important optimization is still largely deferred to backend tools: pipeline optimization. In common RTL flows, registers are inserted by frontend heuristics or by hardware designers, and are adjusted by backend retiming only after the design has been lowered to a netlist. At that point, much of the operator-level structure originally exposed by the compiler IR has been weakened or lost, which limits opportunities for global, compiler-level pipeline optimization. This paper presents PipeRTL, an IR-level pipeline optimization framework for hardware compilers, instantiated in CIRCT. PipeRTL makes the legality of register relocation explicit in the IR, uses a learned timing predictor to approximate downstream delay behavior, and formulates timing-aware register relocation as a global min-cost flow problem under timing constraints. Evaluation on open-source designs under a commercial backend synthesis flow shows that PipeRTL improves downstream implementation quality on average, reducing critical-path delay, power, and area across the evaluated benchmarks, while also providing a stronger starting point for backend retiming. These results indicate that exposing pipeline optimization as an explicit compiler pass improves the sequential structure handed to later stages and yields backend-meaningful gains in downstream implementation quality.
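As a rough illustration of what "legality of register relocation" means, the sketch below uses the classic Leiserson-Saxe retiming model, not PipeRTL's actual IR, predictor, or min-cost flow solver. It assumes a circuit is a graph whose edges carry register counts and whose nodes carry fixed delays. The function names (`retime`, `is_legal`, `clock_period`), the node names, and the delay values are all hypothetical. In this model a relocation is legal when no edge ends up with a negative register count, and the clock period is the longest register-free path.

```python
from collections import defaultdict

def retime(edges, r):
    """Apply retiming labels r: the edge u->v gets w + r(v) - r(u) registers."""
    return {(u, v): w + r.get(v, 0) - r.get(u, 0) for (u, v), w in edges.items()}

def is_legal(edges):
    """A relocation is legal only if every edge keeps a non-negative register count."""
    return all(w >= 0 for w in edges.values())

def clock_period(edges, delay):
    """Longest combinational delay along paths made of register-free (weight-0) edges."""
    succ = defaultdict(list)
    indeg = {v: 0 for v in delay}
    for (u, v), w in edges.items():
        if w == 0:
            succ[u].append(v)
            indeg[v] += 1
    # Kahn-style topological sweep over the register-free subgraph.
    ready = [v for v in delay if indeg[v] == 0]
    arrival = dict(delay)
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for v in succ[u]:
            arrival[v] = max(arrival[v], arrival[u] + delay[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if visited != len(delay):
        raise ValueError("combinational cycle: some loop has no register")
    return max(arrival.values())

# Hypothetical 4-node pipeline: in -> mul -> add -> out, with made-up delays.
delay = {"in": 0, "mul": 4, "add": 2, "out": 0}
edges = {("in", "mul"): 1, ("mul", "add"): 0, ("add", "out"): 1}

# Move one register from the output of `add` to its input.
balanced = retime(edges, {"add": 1})
# Moving two registers would need more registers than the output edge holds.
overdrawn = retime(edges, {"add": 2})
```

Under these assumed delays, the original graph has `mul` and `add` in the same stage, while `balanced` separates them with the same total number of registers, and `overdrawn` fails `is_legal`. A global formulation would search over all labels `r` at once, subject to a timing bound, rather than testing one candidate at a time.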

