In this paper, we argue that current work has failed to provide a comprehensive and maintainable in-memory representation for persistent memory (PM). PM data should be easily mappable into a process address space, shareable across processes, shippable between machines, consistent after a crash, and accessible to legacy code with fast, efficient pointers as first-class abstractions. While existing systems have provided niceties like mmap()-based load/store access, they have not been able to support all these necessary properties due to conflicting requirements. We propose Puddles, a new persistent memory abstraction, to solve these problems. Puddles provide application-independent recovery after a power outage; they make recovery from a system failure a system-level property of the stored data rather than the responsibility of the programs that access it. Puddles use native pointers, so they are compatible with existing code. Finally, Puddles support sharing and shipping PM data between processes and systems without expensive serialization and deserialization. Across YCSB workloads, Puddles are at least as fast as PMDK and up to 1.34$\times$ faster, while remaining competitive with other PM libraries. Moreover, to demonstrate Puddles' ability to relocate data, we showcase a sensor network data-aggregation workload that achieves a 4.7$\times$ speedup over PMDK.