DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks

Data movement between the CPU and main memory is a first-order obstacle to improving performance, scalability, and energy efficiency in modern systems. Computer systems employ a range of techniques to reduce overheads tied to data movement, ranging from traditional mechanisms (e.g., deep multi-level cache hierarchies, aggressive hardware prefetchers) to emerging techniques such as Near-Data Processing (NDP), where some computation is moved close to memory. Our goal is to methodically identify potential sources of data movement across a broad set of applications and to comprehensively compare traditional compute-centric data movement mitigation techniques to more memory-centric techniques, thereby developing a rigorous understanding of the best techniques to mitigate each source of data movement. With this goal in mind, we perform the first large-scale characterization of a wide variety of applications, across a wide range of application domains, to identify fundamental program properties that lead to data movement to/from main memory. We develop the first systematic methodology to classify applications based on the sources contributing to data movement bottlenecks. From our large-scale characterization of 77K functions across 345 applications, we select 144 functions to form the first open-source benchmark suite (DAMOV) for main memory data movement studies. We select a diverse range of functions that (1) represent different types of data movement bottlenecks, and (2) come from a wide range of application domains. Using NDP as a case study, we identify new insights about the different data movement bottlenecks and use these insights to determine the most suitable data movement mitigation mechanism for a particular application. We open-source DAMOV and the complete source code for our new characterization methodology at https://github.com/CMU-SAFARI/DAMOV.
