This paper explores the potential of communicating information gained by static analysis from compilers to Out-of-Order (OoO) machines, focusing on the memory dependence predictor (MDP). The MDP enables loads to issue before all in-flight store addresses are known, with minimal memory order violations. We use LLVM to find loads with no dependencies and label them via their opcode. These labelled loads skip MDP lookups, improving prediction accuracy by reducing false dependencies. We communicate this information in a minimally intrusive way, i.e., without additional hardware cost or instruction bandwidth, so the improvements come with no added overhead in the CPU. We find that in select cases in SPEC CPU 2017, a significant number of load instructions can skip interacting with the MDP, which leads to a performance gain. These results point to greater possibilities for static analysis as a source of near-zero-cost performance gains in future CPU designs.