While functional RISC-V implementations are readily available in academia, controlled empirical studies that extend a single baseline architecture along multiple design axes and quantify the resulting trade-offs at each step remain scarce. This paper presents RV-IM100, a family of 10 incremental FPGA-implemented microarchitectures derived from a common 5-stage pipeline baseline, systematically varying datapath width from RV32 to RV64, instruction set from I to IM, and pipeline depth from 5 to 8 stages under controlled conditions. The I-to-IM extension produced strongly benchmark-dependent effects at the 5-stage level: CoreMark throughput more than doubled while Dhrystone throughput decreased marginally despite improved per-MHz efficiency. Within the RV32IM configuration, an iterative timing-closure methodology combined with pipeline deepening from 5 to 8 stages raised max frequency from 43 to 126MHz, increasing both Dhrystone and CoreMark throughput by 71%, while per-MHz efficiency decreased by 41%. The 6-to-7-stage transition caused throughput regression in RV64 despite higher frequency, revealing that the outcome depends on available frequency headroom. Cross-width comparison showed RV32 outperforming RV64 in absolute throughput, with per-MHz efficiency diverging by benchmark: RV64 led by 2.3% in DMIPS/MHz while RV32 led by 4.6% in CoreMark/MHz. At 8 stages, RV32 required 59% fewer LUTs, 51% fewer FFs, and 80% fewer DSPs, indicating that the resource cost of width extension substantially exceeds the modest efficiency differences. These results provide a quantitative reference for design-space exploration in RISC-V microarchitectures. All RTL sources and benchmark configurations are publicly available.
翻译:暂无翻译