The multi-level design of Log-Structured Merge-trees (LSM-trees) naturally fits tiered storage architectures: the upper levels (recently inserted or updated records) are kept in fast storage to guarantee performance, while the lower levels (the majority of records) are placed in slower but cheaper storage to reduce cost. However, frequently accessed records may already have been compacted into slow storage, and existing algorithms are inefficient at promoting these ``hot'' records to fast storage, which compromises read performance. We present HotRAP, a key-value store based on RocksDB that promptly promotes hot records individually from slow to fast storage and retains them in fast storage for as long as they remain hot. HotRAP uses an on-disk data structure (a specially-made LSM-tree) to track the hotness of keys and includes three pathways to ensure that hot records reach fast storage with short delays. Our experiments show that HotRAP outperforms state-of-the-art LSM-trees on tiered storage, by up to 3.3$\times$ over the second-best system, for read-only and read-write-balanced workloads with common access skew patterns.