ISAAC: Intelligent, Scalable, Agile, and Accelerated CPU Verification via LLM-aided FPGA Parallelism

Functional verification is a critical bottleneck in integrated circuit development, and CPU verification is especially time-intensive and labour-consuming. Industrial practice relies on differential testing for CPU verification, yet faces bottlenecks at nearly every stage of the pipeline. Front-end stimulus generation lacks micro-architectural awareness, yielding low-quality, redundant tests that impede coverage closure and miss corner cases. Meanwhile, back-end simulation infrastructure, even with FPGA acceleration, often stalls on long-running tests and offers limited visibility, delaying feedback and prolonging the debugging cycle. Here, we present ISAAC, a full-stack, Large Language Model (LLM)-aided CPU verification framework with FPGA parallelism, spanning bug categorisation, stimulus generation, and simulation infrastructure. In ISAAC's front-end, we introduce a multi-agent stimulus engine infused with micro-architectural knowledge and historical bug patterns, which generates highly targeted tests that rapidly achieve coverage goals and capture elusive corner cases. In ISAAC's back-end, we introduce a lightweight forward-snapshot mechanism and a decoupled co-simulation architecture between the Instruction Set Simulator (ISS) and the Design Under Test (DUT), enabling a single ISS to drive multiple DUTs in parallel. By eliminating long-tail test bottlenecks and exploiting FPGA parallelism, ISAAC significantly improves simulation throughput. As a demonstration, we used ISAAC to verify a mature CPU that has undergone multiple successful tape-outs. Results show up to 17,536x speed-up over software RTL simulation, and ISAAC detected several previously unknown bugs, two of which are reported in this paper.
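The abstract does not spell out how the forward-snapshot mechanism or the decoupled ISS/DUT co-simulation are built, so the following is only a toy sketch of the general idea, not ISAAC's implementation. It assumes a one-register toy ISA, a reference model that runs once while recording a trace and periodic snapshots, and DUT instances that each replay one segment from a snapshot and compare against the reference trace. All names (`run_iss`, `check_segment`, `verify`, `buggy_step`) are hypothetical, and Python threads merely stand in for parallel DUT instances on an FPGA.

```python
from concurrent.futures import ThreadPoolExecutor


def step(state, instr):
    """Toy ISA (an assumption): each instruction is (op, value) on one accumulator."""
    op, value = instr
    if op == "add":
        return state + value
    if op == "mul":
        return state * value
    raise ValueError(f"unknown op: {op}")


def run_iss(program, snapshot_every):
    """Reference model: run the program once, recording a trace and periodic snapshots."""
    state, trace, snapshots = 0, [], {}
    for pc, instr in enumerate(program):
        if pc % snapshot_every == 0:
            snapshots[pc] = state  # state *before* executing instruction pc
        state = step(state, instr)
        trace.append(state)  # trace[pc] is the state *after* instruction pc
    return trace, snapshots


def buggy_step(state, instr):
    """Hypothetical DUT with an injected bug: multiplying by 3 is off by one."""
    result = step(state, instr)
    return result + 1 if instr == ("mul", 3) else result


def check_segment(dut_step, program, trace, start, state, length):
    """Replay one segment on a DUT from a snapshot; return the first mismatching pc."""
    for pc in range(start, min(start + length, len(program))):
        state = dut_step(state, program[pc])
        if state != trace[pc]:
            return pc
    return None


def verify(dut_step, program, snapshot_every):
    """One reference run feeds many DUT segment checks that execute concurrently."""
    trace, snapshots = run_iss(program, snapshot_every)
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(check_segment, dut_step, program, trace,
                        start, state, snapshot_every)
            for start, state in snapshots.items()
        ]
        mismatches = [f.result() for f in futures]
    return sorted(pc for pc in mismatches if pc is not None)
```

Because every segment starts from a reference snapshot, a long test no longer has to run end to end on one DUT, and a divergence in one segment does not mask divergences in later ones. In this sketch, calling `verify(buggy_step, program, 2)` on a program containing two `("mul", 3)` instructions in different segments reports both mismatch positions.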
