Recent advances in large language models (LLMs) have sparked growing interest in applying them to hardware design automation, particularly for accurate RTL code generation. Prior efforts follow two largely independent paths: (i) training domain-adapted RTL models to internalize hardware semantics, and (ii) developing agentic systems that leverage frontier general-purpose LLMs guided by simulation feedback. However, these two paths exhibit complementary strengths and weaknesses. In this work, we present ACE-RTL, which unifies both directions through Agentic Context Evolution (ACE). ACE-RTL integrates an RTL-specialized LLM, trained on a large-scale dataset of 1.7 million RTL samples, with a frontier reasoning LLM through three synergistic components: a generator, a reflector, and a coordinator. These components iteratively refine RTL code toward functional correctness. We further introduce a parallel scaling strategy that significantly reduces the number of iterations required to reach a correct solution. On the Comprehensive Verilog Design Problems (CVDP) benchmark, ACE-RTL achieves up to a 44.87% pass-rate improvement over 14 competitive baselines while requiring only four iterations on average.
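The generator–reflector–coordinator loop described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `run_simulation`, `generator`, `reflector`, and `coordinator` are hypothetical stand-ins (toy string-based stubs) for the RTL simulator, the RTL-specialized LLM, the frontier reasoning LLM, and the context-merging logic, respectively.

```python
# Minimal sketch of an ACE-style generate -> simulate -> reflect -> coordinate
# loop for iterative RTL repair. All component functions are toy stand-ins.

def run_simulation(code):
    # Hypothetical stand-in for an RTL simulator: returns (passed, log).
    passed = "assign y" in code
    return passed, "" if passed else "output y undriven"

def generator(spec, context):
    # Hypothetical stand-in for the RTL-specialized LLM: emits Verilog text,
    # conditioning on the evolving context.
    body = "assign y = a;" if "fix: drive y" in context else ""
    return f"module top(input a, output y); {body} endmodule"

def reflector(log):
    # Hypothetical stand-in for the frontier reasoning LLM: turns a
    # simulation failure log into repair advice.
    return "fix: drive y" if "undriven" in log else ""

def coordinator(context, advice):
    # Folds the reflector's advice into the evolving context.
    return (context + " " + advice).strip()

def ace_loop(spec, max_iters=4):
    # Iteratively refine RTL code toward functional correctness.
    context = ""
    for i in range(1, max_iters + 1):
        code = generator(spec, context)
        passed, log = run_simulation(code)
        if passed:
            return code, i
        context = coordinator(context, reflector(log))
    return None, max_iters

code, iters = ace_loop("1-bit buffer")
print(code, iters)  # second iteration succeeds once the advice is folded in
```

In the paper's parallel scaling strategy, several such loops would run concurrently with diversified contexts; the sketch shows only a single sequential instance.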