Traditionally, hardware designs are written in the Verilog hardware description language (HDL) and debugged by hardware engineers. While this approach is effective, it is time-consuming and error-prone for complex designs. Large language models (LLMs) show promise for automating HDL code generation: trained on massive datasets of text and code, they can learn to generate code that compiles and is functionally correct. We aim to evaluate the ability of LLMs to generate functionally correct HDL models. We build AutoChip by combining the interactive capabilities of LLMs with the output of Verilog simulations to generate Verilog modules. AutoChip starts with a design prompt for a module and then feeds back context from compilation errors and debugging messages, which highlight differences between the expected and actual outputs, so that the LLM can revise its code. This ensures that accurate Verilog code can be generated without human intervention. We evaluate AutoChip using problem sets from HDLBits and conduct a comprehensive analysis across several LLMs and problem categories. The results show that incorporating context from compiler tools, such as Icarus Verilog, improves effectiveness, yielding 24.20% more accurate Verilog. We release our evaluation scripts and datasets as open source at https://github.com/shailja-thakur/AutoChip.
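The generate, check, and regenerate cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not AutoChip's actual implementation: the function `feedback_loop`, the `generate` and `evaluate` callables, the stub functions, and the `max_iters` budget are all hypothetical names introduced here. In a real setup, `generate` would presumably wrap an LLM call and `evaluate` would compile and simulate the candidate against a testbench (for example with Icarus Verilog); both are replaced by stubs below so the sketch runs on its own.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures, introduced for illustration only:
#   generate(prompt, feedback) -> Verilog source text
#   evaluate(verilog)          -> (passed, tool_output)
Generate = Callable[[str, Optional[str]], str]
Evaluate = Callable[[str], Tuple[bool, str]]


def feedback_loop(prompt: str, generate: Generate, evaluate: Evaluate,
                  max_iters: int = 5) -> Tuple[Optional[str], int]:
    """Regenerate a module until it passes or the attempt budget runs out.

    Returns (passing_source_or_None, attempts_used).
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_iters + 1):
        candidate = generate(prompt, feedback)
        passed, tool_output = evaluate(candidate)
        if passed:
            return candidate, attempt
        # Compiler errors or simulation mismatches become context
        # for the next generation attempt.
        feedback = tool_output
    return None, max_iters


# Stub "LLM" (assumption): emits a wrong gate first, then a fix once
# it has received any feedback.
def stub_generate(prompt: str, feedback: Optional[str]) -> str:
    body = "assign y = a | b;" if feedback is None else "assign y = a & b;"
    return f"module top(input a, input b, output y); {body} endmodule"


# Stub "simulator" (assumption): passes only the AND implementation and
# otherwise reports an expected-versus-actual mismatch.
def stub_evaluate(verilog: str) -> Tuple[bool, str]:
    if "a & b" in verilog:
        return True, ""
    return False, "mismatch: a=1 b=0 expected y=0, got y=1"


source, attempts = feedback_loop("2-input AND gate", stub_generate, stub_evaluate)
```

With these stubs, the first candidate fails the check, its mismatch message is passed back as feedback, and the second candidate passes. Injecting `generate` and `evaluate` as callables keeps the loop independent of any particular model or simulator.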