Because DRAM technology faces scaling limits, non-volatile memory devices, which rely on operating principles different from those of DRAM, are being intensively developed to expand the main memory of computers. Disaggregated memory is also drawing attention as an emerging technology for scaling up main memory. Although system software studies need to discuss management mechanisms for new main memory designs that incorporate such emerging memory systems, no feasible memory emulation mechanism works efficiently for large-scale, privileged programs such as operating systems and hypervisors. In this paper, we propose an FPGA-based main memory emulator for system software studies on new main memory systems. It can emulate a main memory composed of multiple memory regions with different performance characteristics. For the address region of each memory device, it emulates the latency, bandwidth, and bit-flip error rate of read and write operations separately. The emulator is implemented in the hardware module of an off-the-shelf FPGA System-on-Chip board. Any privileged or unprivileged software program running on the board's powerful 64-bit CPU cores can access the emulated main memory devices at a practical speed through exactly the same interface as normal DRAM main memory. We confirmed that the emulator works transparently for the CPU cores and changes the performance of a memory region according to the given emulation parameters; for example, the latencies measured by the CPU cores were exactly proportional to the latencies inserted by the emulator, with a minimum overhead of approximately 240 ns. As a preliminary use case, we confirmed that the emulator allows the bandwidth limit and the inserted latency to be changed independently for unmodified software programs, which makes discussions of latency sensitivity much easier.
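The per-region parameters described above (latency, bandwidth, and bit-flip error rate, set separately for reads and writes) can be pictured with a small software model. This is only an illustrative sketch, not the FPGA implementation: the `Region` fields, the helper function names, and the additive latency model (measured latency = fixed overhead + inserted latency) are assumptions made for this example. The only value taken from the text is the approximate 240 ns minimum overhead.

```python
import random
from dataclasses import dataclass

# Approximate minimum overhead reported in the abstract (ns).
BASE_OVERHEAD_NS = 240.0


@dataclass(frozen=True)
class Region:
    """Hypothetical emulation parameters for one address range."""
    start: int                   # first byte address of the region
    size: int                    # region length in bytes
    read_latency_ns: float       # latency inserted on reads
    write_latency_ns: float      # latency inserted on writes
    read_bw_bytes_per_s: float   # bandwidth limit for reads
    write_bw_bytes_per_s: float  # bandwidth limit for writes
    bit_flip_rate: float         # per-bit flip probability

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.size


def find_region(regions, addr):
    """Return the region covering addr, or raise if none does."""
    for region in regions:
        if region.contains(addr):
            return region
    raise ValueError(f"address {addr:#x} is not in any emulated region")


def observed_latency_ns(region, is_write):
    """Assumed model: fixed overhead plus the inserted latency."""
    inserted = region.write_latency_ns if is_write else region.read_latency_ns
    return BASE_OVERHEAD_NS + inserted


def min_transfer_time_s(region, nbytes, is_write):
    """Lower bound on transfer time implied by the bandwidth limit."""
    bw = region.write_bw_bytes_per_s if is_write else region.read_bw_bytes_per_s
    return nbytes / bw


def inject_bit_flips(data, rate, rng):
    """Flip each bit of data independently with probability rate."""
    out = bytearray(data)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < rate:
                out[i] ^= 1 << bit
    return bytes(out)


# Example layout: a DRAM-like region and a slower, error-prone region.
REGIONS = [
    Region(0x0000, 0x1000, 0.0, 0.0, 1e9, 1e9, 0.0),
    Region(0x1000, 0x1000, 500.0, 1000.0, 1e8, 5e7, 1e-6),
]
```

Under this model, `observed_latency_ns(find_region(REGIONS, 0x1800), is_write=False)` combines the fixed overhead with the 500 ns inserted for reads in the second region. Keeping latency and bandwidth as separate fields mirrors the point that the two can be varied independently of each other.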