Modern EDA flows rely heavily on Tcl scripting, yet general-purpose LLMs perform poorly in this domain due to extreme data scarcity, domain-specific semantics, and the high reliability required in physical design. We present iScript, a domain-adapted Qwen3-8B model for Innovus Tcl script generation, and iScript-Bench, a comprehensive benchmark covering five task categories and three difficulty levels. To overcome the lack of training data, we introduce a multi-stage data synthesis pipeline that integrates command extraction, static linting, requirement back-inference, and Chain-of-Thought generation, producing a dataset of 10K (requirement, CoT, script) tuples. iScript is trained with a two-stage strategy combining domain-adaptive pretraining and supervised fine-tuning. To evaluate script correctness efficiently, we further propose a two-step verification framework consisting of static syntax verification and LLM-based functional evaluation. On our benchmark, iScript achieves higher average pass@k than current state-of-the-art LLMs. These results demonstrate the effectiveness of domain adaptation and data synthesis for EDA scripting tasks.
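The two-step verification framework described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the command whitelist, the brace-balance check standing in for a real Tcl linter, and the stubbed LLM judge are all hypothetical placeholders.

```python
# Hypothetical sketch of a two-step script verifier:
# Step 1: static syntax verification (here, a simple Tcl brace-balance
# check plus a command whitelist, standing in for a real linter).
# Step 2: LLM-based functional evaluation (stubbed; a real system
# would prompt a judge model with the requirement and the script).

# Illustrative subset of Innovus/Tcl commands; not an official list.
KNOWN_COMMANDS = {"set", "puts", "place_design", "route_design", "report_timing"}

def static_check(script: str) -> bool:
    """Reject scripts with unbalanced braces or unknown leading commands."""
    depth = 0
    for ch in script:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:          # closing brace with no matching open
                return False
    if depth != 0:                 # unclosed brace at end of script
        return False
    for line in script.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue               # skip blanks and comments
        if line.split()[0] not in KNOWN_COMMANDS:
            return False
    return True

def functional_check(script: str, requirement: str) -> bool:
    """Placeholder for the LLM judge of functional correctness."""
    return True  # stub: always passes in this sketch

def verify(script: str, requirement: str) -> bool:
    """A script passes only if both verification steps succeed."""
    return static_check(script) and functional_check(script, requirement)
```

Gating the (cheap) static check before the (expensive) LLM evaluation keeps the overall cost of scoring pass@k low, since syntactically broken candidates never reach the judge.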