The discrepancy between processor speed and memory system performance continues to limit the performance of many workloads. Cache prefetching is an effective and well-studied technique for addressing this issue, and many prefetching designs have been proposed, with varying approaches and effectiveness. For example, SPP is a popular prefetcher that leverages confidence-throttled recursion to speculate on the future path of a program's references; however, it is highly susceptible to the reference reordering introduced by higher-level caches and the out-of-order core. Orthogonally, AMPM is another popular approach that uses reordering-resistant access maps to identify patterns within a region, but it cannot speculate beyond that region. In this paper, we propose SPPAM, a new approach to prefetching inspired by prior works such as SPP and AMPM that addresses their limitations. SPPAM uses online learning to build a set of access-map patterns, which drive a speculative lookahead throttled by a confidence metric. Targeting the second-level cache, SPPAM alongside the state-of-the-art prefetchers Berti and Bingo improves system performance by 31.4% over no prefetching and by 6.2% over the baseline of Berti and Pythia.