The rise of graph analytics platforms has led to the development of various benchmarks for evaluating and comparing platform performance. However, existing benchmarks often fall short of fully assessing performance due to limitations in core algorithm selection and in data generation processes (and the corresponding synthetic datasets), as well as their neglect of API usability evaluation. To address these shortcomings, we propose a novel graph analytics benchmark. First, we select eight core algorithms through an extensive review of both academic and industrial settings. Second, we design an efficient and flexible data generator and produce eight new synthetic datasets as the default datasets for our benchmark. Lastly, we introduce a multi-level large language model (LLM)-based framework for API usability evaluation, the first of its kind in graph analytics benchmarks. We conduct comprehensive experimental evaluations on existing platforms (GraphX, PowerGraph, Flash, Grape, Pregel+, Ligra, and G-thinker). The experimental results demonstrate the superiority of our proposed benchmark.