Branch misprediction latency is one of the most important contributors to performance degradation and wasted energy consumption in a modern core. State-of-the-art predictors generally perform very well but occasionally suffer from high Mispredictions Per Kilo Instruction (MPKI) due to hard-to-predict branches. In this work, we investigate whether predicting branches using microarchitectural information, in addition to traditional branch history, can improve prediction accuracy. Our approach considers branch timing information (the resolution cycle) for branches older than the one we re-predict, both those still in the Reorder Buffer (ROB) and those recently committed, as well as for younger branches. We propose Speculative Branch Resolution (SBR), in which, N cycles after a branch allocates in the ROB, various timing information is collected and used to re-predict the branch. Using the gem5 simulator, we implement SBR and perform a limit study of it using a TAGE-like predictor. Our experiments show that the post-allocation timing information we used did not yield performance gains over an unbounded TAGE-SC. However, we find two hard-to-predict branches where timing information did provide an advantage, and we thoroughly analysed one of them to understand why. This finding suggests that predictors may benefit from specific microarchitectural information to increase accuracy on specific hard-to-predict branches, and that overriding predictions in the backend may yet yield performance benefits, but further research is needed to determine such information vectors.
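As a rough illustration of the idea of sampling timing information N cycles after a branch allocates in the ROB, the sketch below builds a toy timing vector for one branch. It is not the implementation evaluated in this work: the `Branch` record, the `timing_vector` function, and the choice of "cycles from allocation to resolution, or -1 if unresolved" as the feature are all assumptions made for illustration. The `mpki` helper simply computes mispredictions per thousand instructions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Branch:
    # Hypothetical record; field names are illustrative, not taken from gem5.
    seq: int                      # program order (smaller means older)
    alloc_cycle: int              # cycle the branch allocated in the ROB
    resolve_cycle: Optional[int]  # cycle the branch resolved, None if unknown


def mpki(mispredictions: int, instructions: int) -> float:
    """Mispredictions per kilo instruction."""
    return 1000.0 * mispredictions / instructions


def timing_vector(target: Branch, window: List[Branch], n: int) -> List[int]:
    """Timing features visible n cycles after `target` allocates.

    For every other branch in the window (older or younger), emit its
    allocation-to-resolution latency if it has resolved by the sample
    cycle, otherwise -1. This encoding is an assumption for illustration.
    """
    sample_cycle = target.alloc_cycle + n
    features = []
    for b in sorted(window, key=lambda br: br.seq):
        if b.seq == target.seq:
            continue  # the branch being re-predicted contributes no feature
        resolved = b.resolve_cycle is not None and b.resolve_cycle <= sample_cycle
        features.append(b.resolve_cycle - b.alloc_cycle if resolved else -1)
    return features


# Example window: one older branch and two younger branches around the target.
window = [
    Branch(seq=0, alloc_cycle=10, resolve_cycle=14),
    Branch(seq=1, alloc_cycle=12, resolve_cycle=30),   # target
    Branch(seq=2, alloc_cycle=13, resolve_cycle=None),
    Branch(seq=3, alloc_cycle=15, resolve_cycle=18),
]
vector = timing_vector(window[1], window, n=8)
```

A vector of this kind could then be hashed together with branch history to index a re-prediction table; which timing features are actually informative is exactly the open question the abstract raises.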