As a core component of modern data centers, the key-value cache provides high-throughput, low-latency services for high-speed data processing. The effectiveness of a key-value cache relies on its ability to accommodate the needed data. However, expanding the cache capacity is often more difficult than commonly expected because of many practical constraints, such as server costs, cooling issues, rack space, and even human resource expenses. A potential solution is compression, which virtually extends the cache capacity by condensing data in the cache. In practice, this seemingly simple idea has not gained much traction in key-value cache system design, due to several critical issues: a compression-unfriendly index structure, severe read/write amplification, wasteful decompression operations, and high computing cost. This paper presents a hybrid DRAM-SSD cache design that realizes a systematic integration of data compression in the key-value cache. By treating compression as an essential component, we have redesigned the indexing structure and data management, and leveraged emerging computational SSD hardware for collaborative optimizations. We have developed a prototype, called ZipCache. Our experimental results show that ZipCache achieves up to 72.4% higher throughput and 42.4% lower latency, while reducing write amplification by up to 26.2 times.