Automated testbench generation has become a critical bottleneck in large language model (LLM)-driven Register Transfer Level (RTL) workflows, where large numbers of candidate designs must be verified rapidly and reliably. Existing prompt-based approaches treat testbench generation as unconstrained code synthesis, yielding stochastic outputs with high token cost, low reproducibility, and insufficient coverage. To address this gap, we present STG, a Structured Testbench Generation framework that exploits the inherent structure of hardware designs to generate deterministic testbenches. As a direct verification tool, STG runs 720x faster than an iterative LLM-based testbench generation flow, achieves a higher compilation success rate and higher coverage, and reduces false-pass verdicts on incorrect designs under test (DUTs). STG also helps identify errors in RTL generation benchmarks by exposing faulty benchmark testbenches. As a data curation engine, it is 11x faster than LLM-based filtering on a single CPU core while using 127x less energy, and the resulting distilled models achieve state-of-the-art performance in our multi-benchmark evaluation. As a test-time scaling oracle, it reduces node count by 14-47%. Our models are available at https://huggingface.co/collections/AS-SiliconMind/siliconmind-v12.