Recently, large language models (LLMs) have demonstrated excellent performance in understanding human instructions and generating code, which has inspired researchers to explore the feasibility of generating RTL code with LLMs. However, existing approaches that fine-tune LLMs on RTL code are typically conducted on fixed datasets, which do not fully elicit the capability of LLMs and require large amounts of reference data. To mitigate these issues, we introduce a simple yet effective iterative training paradigm named ITERTL. During each iteration, samples are drawn from the model trained in the previous iteration and then used for training in the current iteration. This iterative approach reduces the distribution mismatch between the model and the training samples. It also enables the model to explore a broader generative space and receive more comprehensive feedback. Theoretical analyses are conducted to investigate why the approach is effective. Experimental results show that the model trained with our approach can compete with and even outperform the state-of-the-art (SOTA) open-source model while using nearly 37\% of the reference samples, achieving pass@1 rates of 42.9\% and 62.2\% on the two VerilogEval evaluation datasets, respectively. With the same number of reference samples, our method achieves relative improvements of 16.9\% and 12.5\% in pass@1 over the non-iterative method. This study facilitates the application of LLMs to RTL code generation in practical scenarios with limited data.
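The iteration described in the abstract (draw samples from the model of the previous cycle, then train on them) can be sketched as a generic loop. This is only an illustrative skeleton under stated assumptions, not the ITERTL implementation: `iterative_training`, `sample_fn`, `train_fn`, and the toy stand-ins are hypothetical names, and the abstract does not specify how samples are scored or combined with reference data, so that part is left entirely to `train_fn`.

```python
from typing import Callable, List, Tuple, TypeVar

Model = TypeVar("Model")
Sample = Tuple[str, str]  # (prompt, generated RTL code)


def iterative_training(
    model: Model,
    prompts: List[str],
    sample_fn: Callable[[Model, str, int], List[str]],  # hypothetical sampler
    train_fn: Callable[[Model, List[Sample]], Model],   # hypothetical trainer
    num_iterations: int = 3,
    samples_per_prompt: int = 4,
) -> Model:
    """Hypothetical skeleton: each round trains on samples from the previous model."""
    for _ in range(num_iterations):
        # Draw fresh samples from the model produced by the previous cycle.
        batch: List[Sample] = [
            (prompt, code)
            for prompt in prompts
            for code in sample_fn(model, prompt, samples_per_prompt)
        ]
        # Train on the newly drawn samples; the result seeds the next cycle.
        model = train_fn(model, batch)
    return model


# Toy stand-ins (assumptions for illustration): the "model" is a version counter.
history: List[List[Sample]] = []


def toy_sample(model: int, prompt: str, n: int) -> List[str]:
    # Tag each sample with the model version that generated it.
    return [f"v{model}:{prompt}:{i}" for i in range(n)]


def toy_train(model: int, batch: List[Sample]) -> int:
    # Record the batch and pretend training yields the next model version.
    history.append(batch)
    return model + 1


final_model = iterative_training(0, ["adder", "mux"], toy_sample, toy_train)
```

In the toy run, the batch recorded in each round is tagged with the version of the model from the preceding round, which is the property that distinguishes this loop from training on a fixed dataset.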