GraphScope Flex: LEGO-like Graph Computing Stack

Tao He, Shuxian Hu, Longbin Lai, Dongze Li, Neng Li, Xue Li, Lexiao Liu, Xiaojian Luo, Binqing Lyu, Ke Meng, Sijie Shen, Li Su, Lei Wang, Jingbo Xu, Wenyuan Yu, Weibin Zeng, Lei Zhang, Siyuan Zhang, Jingren Zhou, Xiaoli Zhou, Diwen Zhu

Graph computing has become increasingly crucial in processing large-scale graph data, with numerous systems developed for this purpose. Two years ago, we introduced GraphScope as a system addressing a wide array of graph computing needs, including graph traversal, analytics, and learning in one system. Since its inception, GraphScope has achieved significant technological advancements and gained widespread adoption across various industries. However, one key lesson from this journey has been understanding the limitations of a "one-size-fits-all" approach, especially when dealing with the diversity of programming interfaces, applications, and data storage formats in graph computing. In response to these challenges, we present GraphScope Flex, the next iteration of GraphScope. GraphScope Flex is designed to be both resource-efficient and cost-effective, while also providing flexibility and user-friendliness through its LEGO-like modularity. This paper explores the architectural innovations and fundamental design principles of GraphScope Flex, all of which are direct outcomes of the lessons learned during our ongoing development process. We validate the adaptability and efficiency of GraphScope Flex with extensive evaluations on synthetic and real-world datasets. The results show that GraphScope Flex achieves 2.4X throughput and up to 55.7X speedup over other systems on the LDBC Social Network and Graphalytics benchmarks, respectively. Furthermore, GraphScope Flex accomplishes up to a 2,400X performance gain in real-world applications, demonstrating its proficiency across a wide range of graph computing scenarios with increased effectiveness.
