BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers

A fundamental building block in any graph algorithm is a graph container, a data structure used to represent the graph. Ideally, a graph container enables efficient access to the underlying graph, has low space usage, and supports updating the graph efficiently. In this paper, we conduct an extensive empirical evaluation of graph containers designed to support running algorithms on large graphs. To our knowledge, this is the first apples-to-apples comparison of graph containers rather than of overall systems, which include confounding factors such as differences in algorithm implementations and infrastructure. We measure the running time of 10 highly optimized algorithms across over 20 different containers and 10 graphs. Somewhat surprisingly, we find that the average algorithm running time does not differ much across containers, especially those that support dynamic updates. Specifically, a simple container based on an off-the-shelf B-tree is only 1.22x slower on average than a highly optimized static one. Moreover, we observe that simplifying a graph-container Application Programming Interface (API) to only a few simple functions incurs a mere 1.16x slowdown compared to a complete API. Finally, we also measure batch-insert throughput in dynamic-graph containers for a full picture of their performance. To perform the benchmarks, we introduce BYO, a unified framework that standardizes evaluations of graph-algorithm performance across different graph containers. BYO extends the Graph Based Benchmark Suite (Dhulipala et al., 2018), a state-of-the-art graph-algorithm benchmark, to easily plug into different dynamic-graph containers and enable fair comparisons between them on a large suite of graph algorithms. While several graph-algorithm benchmarks have been developed to date, to the best of our knowledge, BYO is the first system designed to benchmark graph containers.
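To make the idea of a graph-container API with "only a few simple functions" concrete, the following is a minimal sketch. It is not BYO's actual interface: the class name `SortedEdgeContainer` and the methods `num_vertices`, `insert_batch`, and `map_neighbors` are hypothetical, and a sorted Python list per vertex stands in for an ordered set such as a B-tree. The point it illustrates is that an algorithm (here, breadth-first search) can be written against the small interface alone, so containers can be swapped without touching the algorithm.

```python
from bisect import bisect_left


class SortedEdgeContainer:
    """Toy dynamic graph container (hypothetical, not BYO's API).

    Each vertex keeps a sorted list of out-neighbors, a stand-in for an
    ordered set such as a B-tree.
    """

    def __init__(self, num_vertices):
        self._adj = [[] for _ in range(num_vertices)]

    def num_vertices(self):
        return len(self._adj)

    def insert_batch(self, edges):
        """Insert a batch of directed edges (u, v), skipping duplicates."""
        for u, v in edges:
            nbrs = self._adj[u]
            i = bisect_left(nbrs, v)
            if i == len(nbrs) or nbrs[i] != v:
                nbrs.insert(i, v)

    def map_neighbors(self, u, fn):
        """Apply fn(u, v) to every out-neighbor v of u, in sorted order."""
        for v in self._adj[u]:
            fn(u, v)


def bfs_levels(graph, source):
    """BFS that uses only num_vertices() and map_neighbors().

    Returns a list of hop distances from source; -1 marks unreachable
    vertices.
    """
    dist = [-1] * graph.num_vertices()
    dist[source] = 0
    frontier = [source]
    while frontier:
        next_frontier = []

        def visit(u, v):
            # First time v is seen: record its level and enqueue it.
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                next_frontier.append(v)

        for u in frontier:
            graph.map_neighbors(u, visit)
        frontier = next_frontier
    return dist


if __name__ == "__main__":
    g = SortedEdgeContainer(5)
    g.insert_batch([(0, 1), (1, 2), (0, 2), (2, 3), (0, 1)])
    print(bfs_levels(g, 0))
```

Any other container that exposes the same three methods could be passed to `bfs_levels` unchanged, which is the kind of substitution a container benchmark relies on.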
