Memory disaggregation is being considered as a strong alternative to traditional architectures for addressing memory under-utilization in data centers. Disaggregated memory can adapt to the dynamically changing memory requirements of data center applications, such as data analytics and big data workloads, that rely on in-memory processing. However, such systems can suffer from high remote memory access latency because of interconnect speeds. In this paper, we explore a rack-scale disaggregated memory architecture and discuss its design aspects. We design a trace-driven simulator that combines an event-based interconnect model with a cycle-accurate memory simulator to evaluate the performance of a disaggregated memory system at rack scale. Our study shows that contention in the remote memory queues, and not only the interconnect, adds significantly to remote memory access latency. We introduce a memory allocation policy that reduces this latency compared to conventional policies. We conduct experiments using various benchmarks with diverse memory access patterns. The results are encouraging for rack-scale memory disaggregation and show acceptable average memory access latency.
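To illustrate the observation that queue contention, and not only the interconnect, contributes to remote memory access latency, the following is a minimal Python sketch of a toy model. It is not the simulator or the allocation policy described in the paper. The names `Pool`, `remote_access_latency`, `average_latency`, `single_pool`, and `least_busy`, and the values of 200 ns per interconnect traversal and 50 ns of service time, are all assumptions chosen for illustration. Each remote access pays one interconnect traversal in each direction, plus any wait in a single FIFO queue at the remote memory pool, plus the service time.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical remote memory pool with a single FIFO request queue."""
    service_ns: float            # assumed time to serve one request
    busy_until_ns: float = 0.0   # time at which the queue drains


def remote_access_latency(pool, arrival_ns, interconnect_ns):
    """Latency of one access: link out, queue wait, service, link back."""
    at_pool = arrival_ns + interconnect_ns
    start = max(at_pool, pool.busy_until_ns)   # wait if the queue is busy
    pool.busy_until_ns = start + pool.service_ns
    return pool.busy_until_ns + interconnect_ns - arrival_ns


def average_latency(arrivals_ns, num_pools, pick_pool,
                    service_ns=50.0, interconnect_ns=200.0):
    """Average latency of a request trace under a given placement rule."""
    pools = [Pool(service_ns) for _ in range(num_pools)]
    total = 0.0
    for i, t in enumerate(arrivals_ns):
        pool = pools[pick_pool(i, pools)]
        total += remote_access_latency(pool, t, interconnect_ns)
    return total / len(arrivals_ns)


def single_pool(i, pools):
    """Baseline stand-in: every request targets the same pool."""
    return 0


def least_busy(i, pools):
    """Illustrative stand-in: target the pool whose queue drains first."""
    return min(range(len(pools)), key=lambda k: pools[k].busy_until_ns)


if __name__ == "__main__":
    trace = [0.0, 10.0, 20.0, 30.0]  # request arrival times in ns
    print(average_latency(trace, 1, single_pool))  # one contended pool
    print(average_latency(trace, 2, least_busy))   # load spread over two
```

In this toy trace the uncontended latency is 450 ns (two traversals plus one service time). Sending every request to one pool raises the average to 510 ns through queueing alone, while spreading the same requests over two pools brings it down to 465 ns. Routing each request separately is a simplification: a real allocation policy decides where memory is placed, not where individual accesses go.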