Register-transfer-level (RTL) simulation on CPUs remains a bottleneck in hardware design. State-of-the-art simulators embed the circuit directly into the simulation binary, resulting in long compilation times and execution that is fundamentally CPU frontend-bound, with severe instruction-cache pressure. This work proposes RTeAAL Sim, which reformulates RTL simulation as a sparse tensor algebra problem. By representing RTL circuits as tensors and simulation as a sparse tensor algebra kernel, RTeAAL Sim decouples simulation behavior from binary size and makes RTL simulation amenable to well-studied tensor algebra optimizations. We demonstrate that a prototype of our tensor-based simulator, even with only a subset of these optimizations, already mitigates the compilation overhead and frontend pressure and achieves performance competitive with the highly optimized Verilator simulator across multiple CPUs and ISAs.
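To make the central idea concrete, here is a minimal sketch of what "circuit as data, simulation as a fixed kernel" can look like. It is an illustration under stated assumptions, not RTeAAL Sim's actual tensor format or kernel: the coordinate-list encoding (`coords`, `op_of`, `reg_pairs`), the function names, and the 2-bit counter circuit are all hypothetical. The circuit's connectivity is stored as the nonzeros of a sparse tensor indexed by (output signal, operand slot, input signal), and one generic loop interprets those nonzeros, so the size of the simulation code does not grow with the circuit.

```python
# Illustrative encoding only; names and layout are assumptions,
# not RTeAAL Sim's actual format.

# Fixed operator table: the kernel's code size does not depend on the circuit.
OPS = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "not": lambda a, _unused: a ^ 1,
}


def simulate_cycle(values, coords, op_of, reg_pairs):
    """Advance the circuit by one clock cycle, updating `values` in place."""
    # Gather operand indices per output signal from the COO nonzeros.
    operands = {}
    for out, slot, src in coords:
        operands.setdefault(out, {})[slot] = src

    # Evaluate combinational signals; op_of is assumed topologically ordered.
    for out, op in op_of:
        srcs = operands[out]
        a = values[srcs[0]]
        b = values[srcs.get(1, srcs[0])]  # unary ops ignore the second operand
        values[out] = OPS[op](a, b)

    # Clock edge: sample every register input first, then commit together.
    sampled = [values[d] for _q, d in reg_pairs]
    for (q, _d), v in zip(reg_pairs, sampled):
        values[q] = v
    return values


def run_counter(cycles):
    """Simulate a 2-bit counter and return its value after each cycle."""
    # Signal indices: 0 = q0, 1 = q1 (registers), 2 = d0, 3 = d1 (next state).
    coords = [(2, 0, 0), (3, 0, 1), (3, 1, 0)]  # (output, operand slot, input)
    op_of = [(2, "not"), (3, "xor")]            # d0 = ~q0, d1 = q1 ^ q0
    reg_pairs = [(0, 2), (1, 3)]                # q0 <= d0, q1 <= d1
    values = [0, 0, 0, 0]
    trace = []
    for _ in range(cycles):
        simulate_cycle(values, coords, op_of, reg_pairs)
        trace.append(values[0] | (values[1] << 1))
    return trace


print(run_counter(4))  # prints [1, 2, 3, 0]
```

Swapping in a different circuit changes only the three data lists; `simulate_cycle` stays the same. That separation is what allows the simulation binary to stay small regardless of design size, in contrast to simulators that compile the circuit into the binary itself.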