We show that, for invertible problems that transform data from a source domain (for example, Logic Condition Tables (LCTs)) to a destination domain (for example, Hardware Description Language (HDL) code), using Large Language Models (LLMs) first as a lossless encoder from source to destination and then as a lossless decoder back to the source, an approach comparable to lossless compression in information theory, can mitigate most of the LLM drawbacks of hallucinations and omissions. Specifically, using LCTs as inputs, we generate the full HDL for a two-dimensional network-on-chip router (13 units, 1500-2000 lines of code) using seven different LLMs, reconstruct the LCTs from the auto-generated HDL, and compare the original and reconstructed LCTs. This approach yields significant productivity improvements: it confirms LLM-generated logic that is correct, detects LLM-generated logic that is incorrect, and helps developers find errors in the design specification.
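The round-trip comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LCT` representation (a mapping from a tuple of conditions to an action), the function names, and the `encode`/`decode` callables standing in for the LLM calls are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical LCT model (an assumption for illustration): each row maps
# a tuple of input conditions to a single output action.
LCT = Dict[Tuple[str, ...], str]
Report = Dict[str, List[Tuple[str, ...]]]


def diff_lcts(original: LCT, reconstructed: LCT) -> Report:
    """Compare two LCTs row by row and classify every mismatch."""
    # Rows present in the original but absent after the round trip.
    missing = sorted(k for k in original if k not in reconstructed)
    # Rows that appear only after the round trip.
    extra = sorted(k for k in reconstructed if k not in original)
    # Rows present in both tables whose action differs.
    changed = sorted(
        k for k in original
        if k in reconstructed and original[k] != reconstructed[k]
    )
    return {"missing": missing, "extra": extra, "changed": changed}


def round_trip_check(
    lct: LCT,
    encode: Callable[[LCT], str],   # stand-in for the LLM call: LCT -> HDL text
    decode: Callable[[str], LCT],   # stand-in for the LLM call: HDL text -> LCT
) -> Report:
    """Encode an LCT to HDL, decode it back, and diff against the original."""
    hdl = encode(lct)
    reconstructed = decode(hdl)
    return diff_lcts(lct, reconstructed)


def is_lossless(report: Report) -> bool:
    """The round trip is lossless only if no mismatch of any kind was found."""
    return not any(report.values())
```

An empty report is consistent with a lossless round trip. A non-empty report only flags rows for review: by itself it does not say whether the encoding step, the decoding step, or the specification is at fault.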