Zoned Namespace (ZNS) defines a new abstraction for host software to flexibly manage storage in flash-based SSDs as append-only zones. It also provides a Zone Append primitive to further boost the write performance of ZNS SSDs by exploiting intra-zone parallelism. However, making Zone Append effective for reliable and scalable storage, in the form of a RAID array of multiple ZNS SSDs, is non-trivial, since Zone Append offloads address management to ZNS SSDs and requires hosts to explicitly manage RAID stripes across multiple drives. We propose ZapRAID, a high-performance log-structured RAID system for ZNS SSDs that carefully exploits Zone Append to achieve high write parallelism and lightweight stripe management. ZapRAID adopts a group-based data layout with a coarse-grained ordering across multiple groups of stripes, such that it can use small-size metadata for stripe management on a per-group basis under Zone Append. It further adopts hybrid data management to simultaneously achieve intra-zone and inter-zone parallelism through a careful combination of both Zone Write and Zone Append primitives. We implement ZapRAID as a user-space block device, and evaluate ZapRAID using microbenchmarks, trace-driven experiments, and real-application experiments. Our evaluation results show that ZapRAID achieves high write throughput and maintains high performance in normal reads, degraded reads, crash recovery, and full-drive recovery.
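To make the stripe-management problem concrete, the following is a minimal toy sketch, not ZapRAID's actual layout or metadata format. It models a zone as an in-memory list and contrasts Zone Write (the host names the offset) with Zone Append (the device picks the offset and reports it on completion). All names here (`ToyZone`, `append_in_groups`, `degraded_read`, `group_size`) are hypothetical. The sketch also assumes single XOR parity fixed on the last drive, and it assumes that a group of stripes is fully written before the next group starts, which is only one possible reading of "coarse-grained ordering across multiple groups of stripes".

```python
import random


class ToyZone:
    """Toy append-only zone: blocks[i] is the chunk stored at offset i."""

    def __init__(self):
        self.blocks = []

    def zone_write(self, offset, chunk):
        # Zone Write: the host chooses the offset, which must be the write pointer.
        if offset != len(self.blocks):
            raise ValueError("offset must equal the zone's write pointer")
        self.blocks.append(chunk)

    def zone_append(self, chunk):
        # Zone Append: the device chooses the offset and returns it on completion.
        self.blocks.append(chunk)
        return len(self.blocks) - 1


def xor_chunks(chunks):
    """Bytewise XOR of equal-length chunks (single parity, an assumption here)."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)


def make_stripe(data_chunks):
    """Data chunks followed by one XOR parity chunk on the last drive."""
    return data_chunks + [xor_chunks(data_chunks)]


def append_in_groups(zones, stripes, group_size, seed=0):
    """Append all stripes group by group. Inside a group, each drive completes
    its appends in its own shuffled order (a stand-in for device reordering);
    one group is fully written before the next one starts.
    Returns table[stripe_id][drive] = offset reported by that drive."""
    rng = random.Random(seed)
    table = [[None] * len(zones) for _ in stripes]
    for start in range(0, len(stripes), group_size):
        group = list(range(start, min(start + group_size, len(stripes))))
        for drive, zone in enumerate(zones):
            order = group[:]
            rng.shuffle(order)
            for sid in order:
                table[sid][drive] = zone.zone_append(stripes[sid][drive])
    return table


def degraded_read(zones, table, sid, lost_drive):
    """Rebuild the chunk of stripe sid on lost_drive from the surviving drives."""
    survivors = [zones[d].blocks[table[sid][d]]
                 for d in range(len(zones)) if d != lost_drive]
    return xor_chunks(survivors)


# Usage: 3 drives (2 data + 1 parity), 8 stripes, groups of 4 stripes.
zones = [ToyZone() for _ in range(3)]
stripes = [make_stripe([bytes([2 * s]) * 4, bytes([2 * s + 1]) * 4])
           for s in range(8)]
table = append_in_groups(zones, stripes, group_size=4)
```

In this toy model, the chunks of one stripe generally end up at different offsets on different drives, so the host needs the per-stripe `table` to locate them for a degraded read. Because each group is written as a unit, every offset of a stripe stays inside its group's range, which suggests why per-group bookkeeping can be kept small.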