A growing number of applications must manage large-scale, evolving graphs under workloads with intensive graph updates and lookups. To address this challenge, we introduce Poly-LSM, a high-performance key-value storage engine for graphs that features three novel techniques: (1) Poly-LSM is built on a new graph-oriented LSM-tree design with a hybrid storage model that stores graph data concisely and effectively. (2) Poly-LSM uses an adaptive mechanism to handle edge insertions and deletions with optimized I/O efficiency. (3) Poly-LSM exploits the skewness of graph data to encode key-value entries. Building on this foundation, we further implement Aster, a robust and versatile graph database that supports the Gremlin query language and thereby facilitates a variety of graph applications. In our experiments, we compare Aster against several mainstream real-world graph databases. The results show that Aster outperforms all baseline graph databases, especially on large-scale graphs. Notably, on the billion-scale Twitter graph dataset, Aster achieves up to a 17x throughput improvement over the best-performing baseline graph system.
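The abstract does not spell out how the hybrid storage model or the adaptive update mechanism work. Purely as an illustration of the general idea, the sketch below assumes one plausible reading: a vertex's out-edges are kept either as a single adjacency-list value (cheap to read, costly to rewrite) or as one small entry per edge (cheap to write), and the choice depends on the vertex's degree. The class `ToyGraphKV`, the `degree_threshold` parameter, and the in-memory containers are all hypothetical and are not taken from Poly-LSM or Aster.

```python
from collections import defaultdict


class ToyGraphKV:
    """Toy in-memory stand-in for a graph key-value store (not Poly-LSM itself)."""

    def __init__(self, degree_threshold=4):
        # Hypothetical cutoff: at or above this degree, new edges get their own entry.
        self.degree_threshold = degree_threshold
        self.adjacency = defaultdict(list)  # vertex-based layout: src -> [dst, ...]
        self.edge_entries = set()           # edge-based layout: one (src, dst) key per edge
        self.degree = defaultdict(int)

    def has_edge(self, src, dst):
        return dst in self.adjacency.get(src, []) or (src, dst) in self.edge_entries

    def add_edge(self, src, dst):
        if self.has_edge(src, dst):
            return
        if self.degree[src] < self.degree_threshold:
            # Low degree: rewriting a short adjacency list is cheap.
            self.adjacency[src].append(dst)
        else:
            # High degree: write one small entry instead of rewriting a long list.
            self.edge_entries.add((src, dst))
        self.degree[src] += 1

    def remove_edge(self, src, dst):
        if dst in self.adjacency.get(src, []):
            self.adjacency[src].remove(dst)
        elif (src, dst) in self.edge_entries:
            self.edge_entries.discard((src, dst))
        else:
            return
        self.degree[src] -= 1

    def neighbors(self, src):
        # A lookup merges both layouts. A real store would use a prefix range
        # scan over edge keys; this toy simply filters the whole set.
        merged = set(self.adjacency.get(src, []))
        merged.update(d for (s, d) in self.edge_entries if s == src)
        return sorted(merged)


g = ToyGraphKV(degree_threshold=2)
for dst in [5, 3, 9, 1]:
    g.add_edge(0, dst)
print(g.neighbors(0))
```

With the threshold set to 2, the first two edges of vertex 0 land in the adjacency list and the remaining two become standalone edge entries, while `neighbors` still returns a single merged view. The trade-off shown here (list rewrites versus many small entries) is the kind of decision an adaptive update policy would have to make; the actual policy used by Poly-LSM may differ.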