Comprehensive evaluation is one of the foundations of experimental science. In high-performance graph processing, a thorough evaluation of contributions becomes more achievable when different frameworks support common input formats. However, each framework defines its own format, which may not support reading large-scale real-world graph datasets. This creates a demand for high-performance graph-loading libraries that (i) accelerate the design of new graph algorithms, (ii) allow contributions to be evaluated on a wide range of graph algorithms, and (iii) facilitate easy and fast comparison across graph frameworks. To that end, we present ParaGrapher, a high-performance API and library for loading large-scale and compressed graphs. ParaGrapher supports different types of requests for accessing graphs in shared-memory, distributed-memory, and out-of-core graph processing. We explain the design of ParaGrapher and present a performance model of graph decompression, which we use to evaluate ParaGrapher on three storage types. Our evaluation shows that, by loading graphs compressed in the WebGraph format, ParaGrapher delivers up to 3.2 times speedup in loading and up to 5.2 times speedup in end-to-end execution (i.e., with loading and execution interleaved) in comparison to the binary and textual formats. ParaGrapher is available online at https://blogs.qub.ac.uk/DIPSA/ParaGrapher/.
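The idea of interleaving loading with execution can be illustrated with a minimal producer-consumer sketch. This is not ParaGrapher's API: the names `load_blocks` and `count_out_degrees` are hypothetical, and an in-memory edge list stands in for reading and decompressing a graph from storage. The point is only that the algorithm consumes each block of edges as soon as it is delivered, rather than waiting for the whole graph to be loaded.

```python
import queue
import threading


def load_blocks(edges, block_size, out_queue):
    """Producer (hypothetical loader): deliver edges in fixed-size blocks.

    In a real loader, each block would be read from storage and decompressed
    here; slicing a list is only a stand-in for that work.
    """
    for start in range(0, len(edges), block_size):
        out_queue.put(edges[start:start + block_size])
    out_queue.put(None)  # sentinel: no more blocks


def count_out_degrees(num_vertices, edges, block_size=2):
    """Consumer: update out-degrees as each block arrives."""
    # A bounded queue limits how far the loader can run ahead of the consumer.
    blocks = queue.Queue(maxsize=4)
    loader = threading.Thread(target=load_blocks, args=(edges, block_size, blocks))
    loader.start()

    degrees = [0] * num_vertices
    while True:
        block = blocks.get()
        if block is None:
            break
        for src, _dst in block:
            degrees[src] += 1

    loader.join()
    return degrees


edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 0)]
degrees = count_out_degrees(4, edges)
print(degrees)
```

With this structure, the end-to-end time approaches the longer of the loading and processing phases rather than their sum, which is the general reason interleaving can help.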