Many applications impose update-intensive workloads on spatial objects, e.g., social-network and ride-sharing services that track moving objects. The Log-Structured Merge tree (LSM) buffers insert and delete operations in memory, and it is widely used in systems that must handle write-heavy workloads. While work on LSM has focused on key-value stores and their optimizations, modern, heterogeneous data necessitates secondary indexes, so there is a need to study how to efficiently support LSM-based {\em secondary} indexes (e.g., location-based indexes). In this paper, we investigate augmenting an LSM secondary index structure with a main-memory memo structure to handle update-intensive workloads efficiently. We conduct this study in the context of an R-tree-based secondary index. In particular, we introduce the LSM RUM-tree, which uses an Update Memo in an LSM-based R-tree to improve the performance of the R-tree's insert, delete, update, and search operations. The Update Memo is a lightweight in-memory structure that is suitable for handling update-intensive workloads without introducing significant overhead. The LSM RUM-tree introduces new strategies to control the size of the Update Memo so that it always fits in memory for high performance. Experimental results using real spatial data demonstrate that the LSM RUM-tree achieves up to 9.6x speedup on update operations and up to 2400x speedup on query processing over existing LSM R-tree implementations.
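To make the role of an Update Memo concrete, the following is a minimal sketch of the general idea, not the algorithms of the LSM RUM-tree itself. It assumes that every index entry carries a stamp and that the memo maps an object id to the stamp at or after which its entries count as current. The class `MemoIndex`, its methods, and the append-only list that stands in for the LSM R-tree are all hypothetical names introduced for illustration; the memo-size control strategies mentioned above are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class MemoIndex:
    """Toy model: an append-only list plays the role of the LSM R-tree (assumption)."""
    entries: List[Tuple[int, int, Point]] = field(default_factory=list)  # (oid, stamp, location)
    memo: Dict[int, int] = field(default_factory=dict)  # oid -> stamp below which entries are obsolete
    clock: int = 0  # global stamp counter

    def upsert(self, oid: int, loc: Point) -> None:
        # Insert or update: append a new entry and never touch the old one.
        self.clock += 1
        self.entries.append((oid, self.clock, loc))
        self.memo[oid] = self.clock

    def delete(self, oid: int) -> None:
        # Delete: only the memo changes; every existing entry of oid becomes obsolete.
        self.clock += 1
        self.memo[oid] = self.clock

    def _is_live(self, oid: int, stamp: int) -> bool:
        # An entry with no memo record is treated as current.
        return stamp >= self.memo.get(oid, 0)

    def search(self, rect: Rect) -> List[int]:
        # Range query: spatial filter first, then discard obsolete entries via the memo.
        xmin, ymin, xmax, ymax = rect
        return sorted(
            oid for oid, stamp, (x, y) in self.entries
            if xmin <= x <= xmax and ymin <= y <= ymax and self._is_live(oid, stamp)
        )

    def clean(self) -> None:
        # Stand-in for cleanup (e.g., during a merge): drop obsolete entries,
        # after which no memo record is needed and the memo can be emptied.
        self.entries = [e for e in self.entries if self._is_live(e[0], e[1])]
        self.memo.clear()


idx = MemoIndex()
idx.upsert(1, (1.0, 1.0))
idx.upsert(2, (2.0, 2.0))
idx.upsert(1, (8.0, 8.0))  # object 1 moves; its old entry stays in the index
near = idx.search((0, 0, 5, 5))
far = idx.search((5, 5, 10, 10))
```

In this sketch an update costs one append plus one memo write, with no search for the old entry, and queries pay for that by filtering against the memo. The memo only needs records for objects that still have obsolete entries in the index, which is why cleanup lets it shrink.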