In-memory ordered key-value stores are an important building block in modern distributed applications. We present Honeycomb, a hybrid software-hardware system that accelerates read-dominated workloads on ordered key-value stores while providing linearizability for all operations, including scans. Honeycomb stores a B-Tree in host memory and splits execution by operation type: SCAN and GET run on an FPGA-based SmartNIC, while PUT, UPDATE, and DELETE run on the CPU. This approach enables large stores and simplifies the FPGA implementation, but it raises the challenge of data access and synchronization across the slow PCIe bus. We describe how Honeycomb overcomes this challenge with careful data structure design, caching, request parallelism with out-of-order request execution, wait-free read operations, and batched synchronization between the CPU and the FPGA. For read-heavy YCSB workloads, Honeycomb improves the throughput of a state-of-the-art ordered key-value store by at least 1.8x. For scan-heavy workloads inspired by cloud storage, Honeycomb improves throughput by more than 2x. Cost-performance, which matters more for large-scale deployments, improves by at least 1.5x on these workloads.
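To make the operation split concrete, the following is a minimal software-only sketch of the ordered key-value interface and of routing each operation to the side that serves it. It is an illustration under stated assumptions, not Honeycomb's implementation: the names `Op`, `route`, and `OrderedStore` are hypothetical, a sorted key list stands in for the B-Tree, the inclusive scan bounds are an assumption, and nothing here models the FPGA, PCIe transfers, caching, or synchronization.

```python
import bisect
from enum import Enum


class Op(Enum):
    GET = "GET"
    SCAN = "SCAN"
    PUT = "PUT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"


# Read operations go to the SmartNIC side; everything else goes to the CPU.
READ_OPS = {Op.GET, Op.SCAN}


def route(op):
    """Return which side serves an operation, following the split in the abstract."""
    return "fpga" if op in READ_OPS else "cpu"


class OrderedStore:
    """Toy ordered store: a sorted key list plus a dict (a stand-in for a B-Tree)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._vals = {}   # key -> value

    def put(self, key, value):
        # Insert the key in sorted position if it is new, then set the value.
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def update(self, key, value):
        # Overwrite an existing key only; report whether the key was present.
        if key not in self._vals:
            return False
        self._vals[key] = value
        return True

    def delete(self, key):
        # Remove the key from both structures; report whether it was present.
        if key not in self._vals:
            return False
        del self._vals[key]
        self._keys.pop(bisect.bisect_left(self._keys, key))
        return True

    def get(self, key):
        # Point lookup; returns None for a missing key.
        return self._vals.get(key)

    def scan(self, lo, hi):
        # Range query: pairs with lo <= key <= hi, in key order
        # (inclusive bounds are an assumption of this sketch).
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return [(k, self._vals[k]) for k in self._keys[i:j]]
```

For example, after `put("b", 2)`, `put("a", 1)`, and `put("c", 3)`, a call to `scan("a", "b")` returns the pairs for `a` and `b` in key order. That ordering is what separates an ordered store from a hash-based one, and it is why scans need the same consistency guarantee as point reads.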