Hardware verification is one of the most challenging stages of the hardware design process, requiring significant time and resources to ensure a design is fully validated and production-ready. Verification teams aim to maximize design coverage while ensuring correct behavior and alignment with the specification. Coverage closure, which relies on iterative constrained-random and directed testing, is still largely manual and therefore slow and labor-intensive. Recent advances show that the code generation capabilities of Large Language Models (LLMs) can be integrated with external tools to build agentic workflows that autonomously perform hardware design and verification tasks. In this work, we introduce Spec2Cov, an agentic framework that automatically and iteratively generates test stimulus directly from design specifications to accelerate coverage closure. Spec2Cov coordinates interactions between an LLM and a hardware simulator, managing compilation and simulation errors, parsing coverage reports, and feeding results back to the model for refinement. We present features that improve Spec2Cov's effectiveness without additional fine-tuning and evaluate their impact. Across 26 designs of varying size and complexity, including problems from the CVDP benchmark suite, Spec2Cov demonstrates promising performance, achieving 100% coverage on simpler designs and up to 49% on more complex designs.
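The feedback loop described in the abstract (generate stimulus from the specification, run it through a simulator, parse the outcome, and return it to the model) can be illustrated with a minimal sketch. This is not the Spec2Cov implementation: the names `SimResult`, `coverage_loop`, `generate`, and `simulate` are hypothetical, the LLM and the simulator are replaced by scripted stand-ins, and keeping only the single best testbench is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class SimResult:
    """Hypothetical summary of one compile-and-simulate run."""
    ok: bool         # True if compilation and simulation both succeeded
    log: str         # error text on failure, coverage report on success
    coverage: float  # percent covered; 0.0 when the run failed


def coverage_loop(
    spec: str,
    generate: Callable[[str, Optional[str]], str],
    simulate: Callable[[str], SimResult],
    target: float = 100.0,
    max_iters: int = 5,
) -> Tuple[Optional[str], float]:
    """Iterate generate -> simulate -> feed back until the target is reached."""
    feedback: Optional[str] = None
    best_tb: Optional[str] = None
    best_cov = 0.0
    for _ in range(max_iters):
        testbench = generate(spec, feedback)  # stand-in for the LLM call
        result = simulate(testbench)          # stand-in for the simulator
        if result.ok and result.coverage > best_cov:
            best_tb, best_cov = testbench, result.coverage
        if best_cov >= target:
            break
        # Errors and coverage holes both go back to the model as text.
        feedback = result.log
    return best_tb, best_cov


def demo() -> Tuple[Tuple[Optional[str], float], List[Optional[str]]]:
    """Run the loop against scripted stand-ins for the LLM and simulator."""
    scripted = iter([
        SimResult(False, "compile error: undeclared signal", 0.0),
        SimResult(True, "uncovered: branch at reset", 60.0),
        SimResult(True, "all bins hit", 100.0),
    ])
    seen_feedback: List[Optional[str]] = []

    def fake_generate(spec: str, feedback: Optional[str]) -> str:
        seen_feedback.append(feedback)
        return f"testbench_v{len(seen_feedback)}"

    def fake_simulate(testbench: str) -> SimResult:
        return next(scripted)

    return coverage_loop("spec text", fake_generate, fake_simulate), seen_feedback


if __name__ == "__main__":
    (tb, cov), history = demo()
    print(tb, cov, history)
```

In the scripted run, the first attempt fails to compile, the second reaches partial coverage, and the third closes coverage. The same text channel carries both compilation errors and coverage gaps back to the generator.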