The multi-level design of Log-Structured Merge-trees (LSM-trees) naturally fits tiered storage architectures: the upper levels (recently inserted or updated records) are kept in fast storage to guarantee performance, while the lower levels (the majority of records) are placed in slower but cheaper storage to reduce cost. However, frequently accessed records may already have been compacted into slow storage, and existing algorithms promote these ``hot'' records to fast storage inefficiently, which degrades read performance. We present HotRAP, a key-value store based on RocksDB that promptly promotes hot records individually from slow to fast storage and keeps them there while they remain hot. HotRAP uses an on-disk data structure (a specially-made LSM-tree) to track the hotness of keys and includes three pathways to ensure that hot records reach fast storage with short delays. Our experiments show that HotRAP outperforms state-of-the-art LSM-trees on tiered storage, beating the second-best system by up to 5.6$\times$ under read-only and read-write-balanced YCSB workloads with common access skew patterns, and by up to 2.0$\times$ under Twitter production workloads.
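To make the idea of per-record promotion concrete, the following is a toy sketch and not HotRAP's implementation. It assumes two in-memory dictionaries standing in for the fast and slow tiers, an in-memory access counter in place of HotRAP's on-disk hotness tracker, and a single read-triggered promotion path rather than the three pathways mentioned above. The class name `TieredStore`, the `demote` helper, and the `hot_threshold` parameter are all hypothetical.

```python
from collections import Counter


class TieredStore:
    """Toy two-tier key-value store (illustrative only, not HotRAP)."""

    def __init__(self, hot_threshold=3):
        self.fast = {}         # stands in for fast storage (upper levels)
        self.slow = {}         # stands in for slow storage (lower levels)
        self.hits = Counter()  # assumption: in-memory counts, not an on-disk tracker
        self.hot_threshold = hot_threshold  # hypothetical promotion threshold

    def put(self, key, value):
        # New writes land in the fast tier; drop any stale copy in the slow tier.
        self.fast[key] = value
        self.slow.pop(key, None)

    def demote(self, key):
        # Mimics compaction moving a record down to slow storage.
        if key in self.fast:
            self.slow[key] = self.fast.pop(key)

    def get(self, key):
        if key in self.fast:
            return self.fast[key]
        if key not in self.slow:
            return None
        # The read was served from slow storage: count it toward hotness.
        self.hits[key] += 1
        value = self.slow[key]
        if self.hits[key] >= self.hot_threshold:
            # Promote this single record to the fast tier and reset its count.
            self.fast[key] = self.slow.pop(key)
            del self.hits[key]
        return value
```

With `hot_threshold=3`, a record that has been demoted is served from the slow tier for its first two reads and moves back to the fast tier on the third, after which reads no longer touch slow storage.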