As graph data grows, it can exceed main memory capacity, so a disk-based graph storage system that ingests updates and analyzes graphs efficiently is crucial. However, existing dynamic graph storage systems suffer from read or write amplification and struggle to optimize read and write performance at the same time. To address this challenge, we propose LSMGraph, a novel dynamic graph storage system that combines the write-friendly LSM-tree with the read-friendly compressed sparse row (CSR) format. It leverages the multi-level structure of LSM-trees to optimize write performance, while the compact CSR structures embedded in the LSM-tree boost read performance. LSMGraph uses a new in-memory structure, MemGraph, to cache graph updates efficiently, and a multi-level index to speed up reads across the levels. Furthermore, LSMGraph incorporates a vertex-grained version control mechanism that mitigates the impact of LSM-tree compaction on read performance and ensures the correctness of concurrent read and write operations. Our evaluation shows that LSMGraph significantly outperforms state-of-the-art (graph) storage systems on both graph update and graph analytical workloads.
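To make the combination of a write buffer, LSM-style levels, and CSR more concrete, the following is a minimal in-memory sketch in Python. It is not LSMGraph's implementation: the names `build_csr`, `csr_neighbors`, and `ToyLeveledGraph`, as well as the `buffer_limit` parameter, are hypothetical, and the sketch ignores disk storage, compaction, deletions, the multi-level index, and version control. It only illustrates how buffered updates can be flushed into immutable CSR runs and how a read then gathers a vertex's neighbors from the buffer and every run.

```python
from typing import Dict, List, Tuple

CSR = Tuple[List[int], List[int]]  # (offsets, targets)


def build_csr(num_vertices: int, edges: List[Tuple[int, int]]) -> CSR:
    """Build CSR arrays from (src, dst) edges using a counting pass."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1          # out-degree of each source vertex
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]   # prefix sum gives start positions
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()       # next free slot per vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def csr_neighbors(csr: CSR, v: int) -> List[int]:
    """Neighbors of v are one contiguous slice of the targets array."""
    offsets, targets = csr
    return targets[offsets[v]:offsets[v + 1]]


class ToyLeveledGraph:
    """Hypothetical toy: a mutable edge buffer plus immutable CSR runs."""

    def __init__(self, num_vertices: int, buffer_limit: int = 4) -> None:
        self.num_vertices = num_vertices
        self.buffer_limit = buffer_limit      # assumed flush threshold
        self.buffer: Dict[int, List[int]] = {}
        self.buffered_edges = 0
        self.runs: List[CSR] = []             # newest run first

    def add_edge(self, src: int, dst: int) -> None:
        # Writes only touch the in-memory buffer, which keeps them cheap.
        self.buffer.setdefault(src, []).append(dst)
        self.buffered_edges += 1
        if self.buffered_edges >= self.buffer_limit:
            self.flush()

    def flush(self) -> None:
        # Freeze the buffer into a compact, read-friendly CSR run.
        edges = [(s, d) for s, dsts in self.buffer.items() for d in dsts]
        self.runs.insert(0, build_csr(self.num_vertices, edges))
        self.buffer = {}
        self.buffered_edges = 0

    def neighbors(self, v: int) -> List[int]:
        # A read consults the buffer first, then every run, newest to oldest.
        result = list(self.buffer.get(v, []))
        for run in self.runs:
            result.extend(csr_neighbors(run, v))
        return result
```

In this toy, every read has to visit each run, which is the read amplification the abstract refers to; a real system would merge runs through compaction and use an index to skip runs that hold no edges for the queried vertex.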