Mitigating the Memory Bottleneck with Machine Learning-Driven and Data-Aware Microarchitectural Techniques

Modern applications process massive data volumes that overwhelm the storage and retrieval capabilities of memory systems, making memory the primary performance and energy-efficiency bottleneck of computing systems. Although many microarchitectural techniques attempt to hide or tolerate long memory access latency, rapidly growing data footprints continue to outpace technology scaling, requiring more effective solutions. This dissertation shows that modern processors observe large amounts of application and system data during execution, yet many microarchitectural mechanisms make decisions largely independent of this information. Through four case studies, we demonstrate that such data-agnostic design leads to substantial missed opportunities for improving performance and energy efficiency. To address this limitation, this dissertation advocates shifting microarchitecture design from data-agnostic to data-informed. We propose mechanisms that (1) learn policies from observed execution behavior (data-driven design) and (2) exploit semantic characteristics of application data (data-aware design). We apply lightweight machine learning techniques and previously underexplored data characteristics across four processor components: a reinforcement learning-based hardware data prefetcher that learns memory access patterns online; a perceptron predictor that identifies memory requests likely to access off-chip memory; a reinforcement learning mechanism that coordinates data prefetching and off-chip prediction; and a mechanism that exploits repeatability in memory addresses and loaded values to eliminate predictable load instructions. Our extensive evaluation shows that the proposed techniques significantly improve performance and energy efficiency compared to prior state-of-the-art approaches.
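To make the idea of a perceptron-based off-chip predictor concrete, the following is a minimal sketch of a generic hashed perceptron that guesses whether a load will be served from off-chip memory. It is not the dissertation's design: the class name `OffChipPerceptron`, the three features (load PC, cacheline offset within the page, PC XOR page number), the table size, and both thresholds are assumptions chosen only for illustration.

```python
class OffChipPerceptron:
    """Toy hashed-perceptron predictor. All sizes, features, and
    thresholds here are illustrative assumptions, not a real design."""

    def __init__(self, table_size=1024, act_threshold=1,
                 train_threshold=8, w_max=15):
        self.table_size = table_size
        self.act_threshold = act_threshold      # predict off-chip if sum >= this
        self.train_threshold = train_threshold  # keep training while |sum| is below this
        self.w_max = w_max                      # weights saturate in [-w_max - 1, w_max]
        # One weight table per feature, all weights start at zero.
        self.tables = [[0] * table_size for _ in range(3)]

    def _indices(self, pc, addr):
        # Hypothetical features: load PC, cacheline offset in a 4 KiB page,
        # and PC XOR page number. Each feature indexes its own table.
        line = addr >> 6
        features = (pc, line & 0x3F, pc ^ (addr >> 12))
        return [f % self.table_size for f in features]

    def _sum(self, pc, addr):
        return sum(t[i] for t, i in zip(self.tables, self._indices(pc, addr)))

    def predict(self, pc, addr):
        """Return True if the load is predicted to go off-chip."""
        return self._sum(pc, addr) >= self.act_threshold

    def train(self, pc, addr, went_off_chip):
        """Update weights once the true outcome of the load is known."""
        total = self._sum(pc, addr)
        predicted = total >= self.act_threshold
        # Train on a misprediction or when confidence is still low.
        if predicted != went_off_chip or abs(total) < self.train_threshold:
            delta = 1 if went_off_chip else -1
            for t, i in zip(self.tables, self._indices(pc, addr)):
                t[i] = max(-self.w_max - 1, min(self.w_max, t[i] + delta))


predictor = OffChipPerceptron()
for _ in range(4):
    predictor.train(pc=0x401, addr=0x12345040, went_off_chip=True)
    predictor.train(pc=0x7F3, addr=0x2000, went_off_chip=False)
```

In this sketch, the load at the hypothetical PC `0x401` is predicted to go off-chip after a few training steps, while the load at `0x7F3` is not. A hardware version would use small saturating counters and cheap hash functions in place of Python lists and the modulo operator.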
