Architectural simulators play a critical role in early microarchitectural exploration due to their flexibility and high productivity. However, their usefulness is often limited by fidelity: a simulator may deviate from the behavior of the final RTL, leading to unreliable performance estimates. Model calibration, which aligns simulator behavior with the RTL as the ground-truth microarchitecture, is therefore essential for accurate performance modeling. To improve calibration accuracy, we propose Microarchitecture Cliffs, a benchmark generation methodology designed to expose mismatches in microarchitectural behavior between the simulator and the RTL. Once the key architectural components that require calibration have been identified, the Cliff methodology uses a set of benchmarks to attribute each microarchitectural difference to a single microarchitectural feature. We also develop a set of automated tools to improve the efficiency of the Cliff workflow. We apply the Cliff methodology to calibrate the XiangShan version of gem5 (XS-GEM5) against the XiangShan open-source CPU (XS-RTL), reducing the performance error of XS-GEM5 from 59.2% to 1.4% on the Cliff benchmarks. The Cliff-guided calibration also reduces the relative error of a representative tightly coupled microarchitectural feature by 48.03%, and it substantially lowers the absolute performance error, with reductions of 15.1% on SPECint2017 and 21.0% on SPECfp2017.
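As a rough illustration of the kind of comparison described above, the following Python sketch computes a per-benchmark relative error between simulator and RTL cycle counts and ranks the features whose benchmarks disagree most. It rests on assumptions not taken from the abstract: the error metric is assumed to be |sim - rtl| / rtl on cycle counts, each benchmark is assumed to isolate exactly one feature, and the names `CliffResult`, `relative_error`, `rank_mismatches`, the feature labels, the threshold, and all numbers are hypothetical placeholders, not results or tooling from the work itself.

```python
from dataclasses import dataclass


@dataclass
class CliffResult:
    """One benchmark run; every field value used below is made up."""
    feature: str     # the single microarchitectural feature the benchmark isolates
    sim_cycles: int  # cycle count reported by the simulator
    rtl_cycles: int  # cycle count reported by RTL simulation (ground truth)


def relative_error(sim_cycles: int, rtl_cycles: int) -> float:
    """Assumed metric: absolute relative error of the simulator against RTL."""
    if rtl_cycles <= 0:
        raise ValueError("rtl_cycles must be positive")
    return abs(sim_cycles - rtl_cycles) / rtl_cycles


def rank_mismatches(results, threshold=0.05):
    """Return (feature, error) pairs whose error exceeds threshold, largest first."""
    scored = [(r.feature, relative_error(r.sim_cycles, r.rtl_cycles)) for r in results]
    flagged = [(name, err) for name, err in scored if err > threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs, for illustration only.
results = [
    CliffResult("load_to_use_latency", sim_cycles=1200, rtl_cycles=1000),
    CliffResult("branch_mispredict_penalty", sim_cycles=1010, rtl_cycles=1000),
    CliffResult("rob_capacity", sim_cycles=700, rtl_cycles=1000),
]

ranked = rank_mismatches(results)
for name, err in ranked:
    print(f"{name}: {err:.1%}")
```

Because each benchmark in this sketch targets one feature, a large error points directly at the component to calibrate first; a benchmark that exercised several features at once would not allow that attribution.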