The rapid progress of artificial intelligence increasingly relies on efficient integrated circuit (IC) design. Recent studies have explored the use of large language models (LLMs) for generating Register Transfer Level (RTL) code, but existing benchmarks mainly evaluate syntactic correctness rather than optimization quality in terms of power, performance, and area (PPA). This work introduces RTL-OPT, a benchmark for assessing the capability of LLMs in RTL optimization. RTL-OPT contains 36 handcrafted digital designs spanning diverse implementation categories, including combinational logic, pipelined datapaths, finite state machines, and memory interfaces. Each task provides a pair of RTL implementations: a suboptimal version and a human-optimized reference that embodies industry-proven optimization patterns not captured by conventional synthesis tools. Furthermore, RTL-OPT integrates an automated evaluation framework that verifies functional correctness and quantifies PPA improvements, enabling standardized and meaningful assessment of generative models for hardware design optimization.