This paper presents ARCS (Autoregressive Circuit Synthesis), a system for amortized analog circuit generation. ARCS produces complete, SPICE-simulatable designs (topology and component values) in milliseconds rather than the minutes required by search-based methods. A hybrid pipeline combines two learned generators, a graph VAE and a flow-matching model, with SPICE-based ranking. This pipeline achieves 99.9% simulation validity (reward 6.43/8.0) across 32 topologies using only 8 SPICE evaluations, 40x fewer than genetic algorithms. For single-model inference, a topology-aware Graph Transformer with Best-of-3 candidate selection reaches 85% simulation validity in 97 ms, over 600x faster than random search. The key technical contribution is an adaptation of Group Relative Policy Optimization (GRPO) to multi-topology circuit reinforcement learning. Through per-topology advantage normalization, GRPO resolves a critical failure mode of REINFORCE: cross-topology reward distribution mismatch. This improves simulation validity by 9.6 percentage points over REINFORCE in only 500 RL steps (10x fewer). In addition, grammar-constrained decoding guarantees 100% structural validity by construction via topology-aware token masking.
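The per-topology advantage normalization described above can be illustrated with a minimal sketch. It is not the ARCS implementation: the function names, the reward values, and the two-topology batch are hypothetical, and the REINFORCE baseline is assumed here to be a single batch-wide mean. The sketch only shows why normalizing within each topology group removes the cross-topology reward mismatch.

```python
import numpy as np


def global_advantages(rewards):
    """Assumed REINFORCE-style baseline: subtract one batch-wide mean."""
    rewards = np.asarray(rewards, dtype=float)
    return rewards - rewards.mean()


def per_topology_advantages(rewards, topology_ids, eps=1e-8):
    """GRPO-style advantages: standardize rewards within each topology group."""
    rewards = np.asarray(rewards, dtype=float)
    topology_ids = np.asarray(topology_ids)
    advantages = np.zeros_like(rewards)
    for topo in np.unique(topology_ids):
        mask = topology_ids == topo
        group = rewards[mask]
        # Mean and standard deviation come from this topology only.
        advantages[mask] = (group - group.mean()) / (group.std() + eps)
    return advantages


# Hypothetical batch: topology 0 is "easy" (high rewards), topology 1 is "hard".
rewards = [7.0, 7.5, 8.0, 1.0, 1.5, 2.0]
topology_ids = [0, 0, 0, 1, 1, 1]

adv_global = global_advantages(rewards)
adv_group = per_topology_advantages(rewards, topology_ids)

print("global baseline:", adv_global)
print("per-topology:   ", adv_group)
```

With the batch-wide baseline, every sample from the hard topology receives a negative advantage, including its best sample, so the gradient signal mostly encodes which topology was drawn. With per-topology normalization, the best sample of each topology receives the same positive advantage regardless of that topology's reward scale.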