The automatic generation of RTL code (e.g., Verilog) from natural language instructions using large language models (LLMs) has recently attracted significant research interest. However, most existing approaches rely heavily on commercial LLMs such as ChatGPT, while open-source LLMs tailored to this design generation task exhibit notably inferior performance. The absence of high-quality open-source solutions restricts the flexibility and data privacy of this emerging technique. In this study, we present a new customized LLM solution with a modest parameter count of only 7B, achieving better performance than GPT-3.5 on all representative benchmarks for RTL code generation. In particular, it outperforms GPT-4 on the VerilogEval Machine benchmark. This remarkable balance between accuracy and efficiency is made possible by leveraging our new RTL code dataset and a customized LLM algorithm, both of which have been made fully open-source. Furthermore, we have successfully quantized our LLM to 4-bit, reducing its total size to 4 GB, enabling it to run on a single laptop with only slight performance degradation. This efficiency allows the RTL generator to serve as a local assistant for engineers, ensuring that all design privacy concerns are addressed.