Natural language interfaces, powered by large language models, have shown considerable promise for automating Verilog generation from high-level specifications and have attracted significant attention. However, this paper shows that for hardware architectures with spatial complexity, visual representations convey contextual information essential to design intent and can outperform natural-language-only inputs. Building on this observation, we introduce an open-source benchmark for multi-modal generative models that synthesize Verilog from combined visual and linguistic inputs, covering both single and complex modules. We also introduce an open-source visual and natural language Verilog query framework that supports efficient, user-friendly multi-modal queries. To evaluate the proposed multi-modal hardware generative AI on Verilog generation tasks, we compare it against a popular method that relies solely on natural language. Our results demonstrate a significant accuracy improvement in the multi-modal generated Verilog over natural-language-only queries. We hope this work points to a new approach to hardware design in the large-hardware-design-model era, fostering more diverse and productive design methodologies.