Unit tests are critical in the hardware design lifecycle: they ensure that component design modules are functionally correct and conform to the specification before system-level integration. Developing unit tests that target diverse design features therefore demands both a deep understanding of the design's functionality and considerable creativity. When one or more unit tests expose a design failure, the debugging engineer must diagnose, localize, and repair the failure to restore design correctness, which is often a painstaking and labor-intensive process. In this work, we introduce LAUDE, a unified unit-test generation and debugging framework for hardware designs that cross-pollinates semantic understanding of the design source code with the Chain-of-Thought (CoT) reasoning capabilities of foundational Large Language Models (LLMs). LAUDE integrates prompt engineering and design execution information to improve its unit-test generation accuracy and code debuggability. We apply LAUDE with closed- and open-source LLMs to a large corpus of buggy hardware designs derived from the VerilogEval dataset, where the generated unit tests detected bugs in up to 100% of combinational and 93% of sequential designs, and debugged up to 93% of combinational and 84% of sequential designs, respectively.