The Compute Express Link (CXL) interconnect makes it possible to integrate diverse memory types into servers over byte-addressable SerDes links. Harnessing the full potential of such heterogeneous memory systems requires efficient memory tiering. However, existing work in this domain has been constrained by memory access profiling techniques that offer low resolution and incur high overhead. To address this challenge, we propose NeoMem, a solution that enhances existing memory tiering systems. NeoMem offloads memory profiling functions to device-side controllers by integrating a dedicated hardware unit called NeoProf. NeoProf tracks memory accesses and provides the operating system with page hotness statistics and other useful system state information. On the OS kernel side, we introduce a revamped memory-tiering strategy that enables accurate and timely hot page promotion based on NeoProf statistics. We implement NeoMem on a real CXL-enabled FPGA platform and Linux kernel v6.3. Comprehensive evaluations demonstrate that NeoMem achieves 32% to 67% geomean speedup over several existing memory tiering solutions.