Driven by the rapid co-evolution of harnesses and their underlying models, LLM agents are improving at a dizzying pace. In our prior work (performed in Dec. 2025), we introduced "Design Conductor" (or just "Conductor"), a system capable of building a 5-stage, Linux-capable RISC-V CPU in 12 hours. In this work, we introduce an updated multi-agent harness, powered by frontier models released in April 2026, that handles 80x larger tasks at higher quality and fully autonomously. Following a brief introduction, we examine four designs that the system produced autonomously, including "VerTQ", an LLM inference accelerator that hard-wires support for TurboQuant in a 240-cycle pipeline, built starting from the TurboQuant arXiv paper. VerTQ is compute-heavy, with 5129 FP16/32 units; the design was mapped to an FPGA at 125 MHz and occupies 5.7 mm^2 in TSMC 16FF (8 attention pipes). We review the key new characteristics that enabled these results. Finally, we analyze Design Conductor's token usage and other empirical characteristics, including its limitations.