The discrepancy between processor speed and memory-system performance continues to limit the performance of many workloads. Cache prefetching is an effective and well-studied technique for addressing this gap, and many prefetching designs have been proposed, with varying approaches and effectiveness. For example, SPP is a popular prefetcher that uses confidence-throttled recursion to speculate on the future path of a program's references; however, it is highly susceptible to the reference reordering introduced by higher-level caches and the out-of-order core. Orthogonally, AMPM is another popular approach that uses reordering-resistant access maps to identify patterns within a region, but it cannot speculate beyond that region. In this paper, we propose SPPAM, a new approach to prefetching inspired by prior work such as SPP and AMPM while addressing their limitations. SPPAM uses online learning to build a set of access-map patterns, which drive a speculative lookahead throttled by a confidence metric. Targeting the second-level cache, SPPAM alongside the state-of-the-art prefetchers Berti and Bingo improves system performance by 31.4% over no prefetching and by 6.2% over the baseline of Berti and Pythia.