Hardware verification is one of the most challenging stages of the hardware design process, requiring significant time and resources to ensure a design is fully validated and production-ready. Verification teams aim to maximize design coverage while ensuring correct behavior and alignment with the specification. Coverage closure, which relies on iterative constrained-random and directed testing, is still largely manual and therefore slow and labor-intensive. Recent advances show that the code generation capabilities of Large Language Models (LLMs) can be integrated with external tools to build agentic workflows that autonomously perform hardware design and verification tasks. In this work, we introduce Spec2Cov, an agentic framework that automatically and iteratively generates test stimulus directly from design specifications to accelerate coverage closure. Spec2Cov coordinates interactions between an LLM and a hardware simulator, managing compilation and simulation errors, parsing coverage reports, and feeding results back to the model for refinement. We present features that improve Spec2Cov's effectiveness without additional fine-tuning and evaluate their impact. Across 26 designs of varying size and complexity, including problems from the CVDP benchmark suite, Spec2Cov demonstrates promising performance, achieving 100% coverage on simpler designs and up to 49% on more complex designs.
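The generate, simulate, and refine cycle described above can be illustrated with a minimal sketch. This is not the Spec2Cov implementation: `SimResult`, `coverage_loop`, and the `generate` and `simulate` callables are hypothetical names introduced here, and the sketch assumes the simulator wrapper reports cumulative coverage as a percentage together with a text log.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class SimResult:
    """Outcome of one compile-and-simulate attempt (hypothetical structure)."""
    ok: bool                # False when compilation or simulation failed
    log: str                # error text or coverage summary to feed back
    coverage: float = 0.0   # percent coverage, meaningful only when ok is True


def coverage_loop(
    spec: str,
    generate: Callable[[str, Optional[str]], str],  # stands in for the LLM call
    simulate: Callable[[str], SimResult],           # stands in for the simulator
    target: float = 100.0,
    max_iters: int = 10,
) -> Tuple[float, List[str]]:
    """Repeat generate -> simulate -> feed back until target or budget is reached."""
    feedback: Optional[str] = None
    best = 0.0
    kept: List[str] = []
    for _ in range(max_iters):
        stimulus = generate(spec, feedback)
        result = simulate(stimulus)
        if not result.ok:
            # Compilation or simulation error: return the log for repair.
            feedback = "Fix these errors:\n" + result.log
            continue
        if result.coverage > best:
            # Keep only stimulus that improved coverage.
            best = result.coverage
            kept.append(stimulus)
        if best >= target:
            break
        # Otherwise report what is still uncovered and try again.
        feedback = "Uncovered items:\n" + result.log
    return best, kept


# Toy stand-ins so the sketch runs without an LLM or a simulator.
def fake_generate(spec: str, feedback: Optional[str]) -> str:
    return "test_v2" if feedback else "test_v1"


def fake_simulate(stimulus: str) -> SimResult:
    if stimulus == "test_v1":
        return SimResult(ok=True, log="branch at line 12 not hit", coverage=60.0)
    return SimResult(ok=True, log="", coverage=100.0)


best, kept = coverage_loop("adder spec", fake_generate, fake_simulate)
```

Passing the model and simulator in as callables keeps the loop independent of any particular LLM API or simulator command line, neither of which is specified in the abstract.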