Persistent memory (PMEM) devices present an opportunity to retain the flexibility of main-memory data structures and algorithms while augmenting them with reliability and persistence. The challenge is to combine replication (for reliability) and failure atomicity (for persistence) with concurrency (for fully utilizing persistent memory bandwidth). These requirements are at odds: replicating a log of updates is inherently sequential, whereas fully leveraging the path from CPU to memory requires concurrent updates. We present Blizzard, a fault-tolerant, PMEM-optimized persistent programming runtime. Blizzard addresses this fundamental tradeoff by combining (1) a coupled operations log that permits tight integration of a PMEM-specialized user-level replication stack with a PMEM-based persistence stack, and (2) explicit control over the commutativity among concurrent operations. We demonstrate the generality and potential of Blizzard with three illustrative applications whose persistent state has very different data structure requirements. These use cases show that, with Blizzard, PMEM-native data structures can deliver up to 3.6x better performance than alternative purpose-built persistent application runtimes, while being simpler and safer (by providing failure atomicity and replication).
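The tension between a sequential replicated log and concurrent updates can be illustrated with a minimal sketch. This is not Blizzard's interface or implementation: the `Op` type, the `commutes` rule (operations on different keys commute), and the `schedule` and `apply_batches` helpers are hypothetical names introduced only for illustration. The sketch splits an ordered log into batches of mutually commuting operations, so that operations within a batch could in principle be applied in any order, or concurrently, while conflicting operations keep their log order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    seq: int    # position in the (sequential) replicated log
    key: str    # hypothetical: the piece of state this operation writes
    value: int


def commutes(a: Op, b: Op) -> bool:
    # Assumption for this sketch: writes to different keys commute.
    return a.key != b.key


def schedule(log):
    """Split an ordered log into batches of mutually commuting operations.

    Batches are ordered; operations inside one batch may be applied in
    any order without changing the final state.
    """
    batches = []
    for op in log:
        if batches and all(commutes(op, other) for other in batches[-1]):
            batches[-1].append(op)
        else:
            batches.append([op])
    return batches


def apply_batches(batches, reverse_within_batch=False):
    """Apply batches in order; optionally reverse the order inside each batch."""
    state = {}
    for batch in batches:
        ops = reversed(batch) if reverse_within_batch else batch
        for op in ops:
            state[op.key] = op.value
    return state


log = [Op(0, "a", 1), Op(1, "b", 2), Op(2, "a", 3), Op(3, "c", 4)]
batches = schedule(log)
final_state = apply_batches(batches)
```

In this example the two writes to key `a` land in different batches and therefore keep their log order, while the writes to `b` and `c` share a batch with a non-conflicting write. Reordering operations within a batch leaves the final state unchanged, which is the property that would let a runtime apply them concurrently.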