Modern processors predict the direction of conditional branches with high accuracy. We find that several instructions in the dynamic instruction stream contribute only towards computing the conditions of these branches. Hence, when the predicted direction of a conditional branch is indeed correct, these instructions become ineffectual: the functional state of the program would be no different had they been dropped. However, the execution of ineffectual instructions cannot be avoided altogether, because the predicted branch direction may be wrong. In this work, we determine all sources of ineffectuality in an instruction stream, such as conditional branches, predicated instructions, indirect jumps, and dynamically dead instructions. We then propose a technique to steer ineffectual instructions away from the primary execution cluster so that effectual instructions can execute uncontended. We find that such ineffectuality-based clustering of instructions naturally simplifies the design and avoids several caveats of a clustered architecture. Finally, we propose a technique to detect instances in which instructions were incorrectly marked as ineffectual, for example due to a branch misprediction, and to recover the pipeline. An empirical evaluation of the proposed changes on the SPEC CPU2017 and GAPBS benchmarks shows performance uplifts of up to 4.9% and 10.3% on average, respectively.
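To make the notion of ineffectuality concrete, the following is a minimal offline sketch in Python. It is not the hardware mechanism proposed in this work. It assumes a toy register-only trace in which every branch is correctly predicted, and it walks the trace backwards to mark instructions whose results never reach a store or the final register state. The `Inst` record, the `mark_ineffectual` function, the opcode names, and the register names are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """One dynamic instruction in a toy trace (hypothetical format)."""
    op: str                 # "branch", "store", or any register-writing op
    dest: Optional[str]     # destination register, None for branch/store
    srcs: Tuple[str, ...]   # source registers


def mark_ineffectual(trace, live_out):
    """Return sorted indices of instructions that do not affect functional state.

    Assumption: every branch in the trace was predicted correctly, so a branch
    and the instructions that only feed it are ineffectual.
    """
    needed = set(live_out)  # registers whose current value is still required
    ineffectual = []
    for idx in range(len(trace) - 1, -1, -1):  # backward pass over the trace
        inst = trace[idx]
        if inst.op == "branch":
            # Correctly predicted: its sources are NOT added to `needed`.
            ineffectual.append(idx)
        elif inst.op == "store":
            # Stores change memory, so their sources are required.
            needed.update(inst.srcs)
        elif inst.dest in needed:
            # Effectual producer: its own inputs become required in turn.
            needed.discard(inst.dest)
            needed.update(inst.srcs)
        else:
            # Result feeds only ineffectual consumers, or is dynamically dead.
            ineffectual.append(idx)
    return sorted(ineffectual)


TRACE = [
    Inst("load", "r1", ("r0",)),        # 0: value later stored
    Inst("add", "t1", ("r5", "r6")),    # 1: feeds only the compare
    Inst("cmp", "t0", ("t1", "r7")),    # 2: feeds only the branch
    Inst("branch", None, ("t0",)),      # 3: assumed correctly predicted
    Inst("add", "r2", ("r1",)),         # 4: value later stored
    Inst("store", None, ("r2", "r0")),  # 5: updates memory
]

RESULT = mark_ineffectual(TRACE, live_out={"r1", "r2"})
print(RESULT)
```

In this toy trace the branch and the two instructions that compute its condition are marked, while the load, the second add, and the store are not. A real design cannot run such a backward pass over the future of the instruction stream, which is why execution of these instructions is steered elsewhere rather than skipped, and why a recovery path is needed when a branch turns out to be mispredicted.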