PageRank is a popular centrality metric that assigns importance to each vertex of a graph based on its neighbors and their scores. Efficient parallel algorithms for updating PageRank on dynamic graphs are crucial for various applications, especially as dataset sizes have reached substantial scales. This technical report presents our Dynamic Frontier approach. Given a batch update of edge deletions and insertions, it progressively identifies, with minimal overhead, the affected vertices whose ranks are likely to change. On a server equipped with a 64-core AMD EPYC-7742 processor, our Dynamic Frontier PageRank outperforms Static, Naive-dynamic, and Dynamic Traversal PageRank by 7.8x, 2.9x, and 3.9x respectively, on uniformly random batch updates of size 10^-7 |E| to 10^-3 |E|. In addition, our approach improves performance at an average rate of 1.8x for every doubling of threads.
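To make the idea concrete, the following is a minimal single-threaded Python sketch of a frontier-style incremental update. It is not the report's parallel implementation. The initial marking rule (out-neighbors of each updated edge's source vertex), the expansion rule (mark a vertex's out-neighbors once its rank changes by more than `frontier_tol`), the tolerances, the damping factor, and all function names are assumptions made for illustration, since the abstract does not specify them. The sketch also assumes a fixed vertex set in which every vertex has at least one out-edge.

```python
def iterate_ranks(out_edges, ranks, affected, damping=0.85,
                  tol=1e-10, frontier_tol=1e-12, max_iter=500):
    """Recompute ranks only for `affected` vertices, growing that set as
    rank changes propagate. All parameter defaults are assumed values."""
    n = len(out_edges)
    # Build reverse adjacency so each vertex can pull from its in-neighbors.
    in_edges = {v: set() for v in out_edges}
    for u, nbrs in out_edges.items():
        for v in nbrs:
            in_edges[v].add(u)
    ranks = dict(ranks)
    affected = set(affected)
    for _ in range(max_iter):
        new_ranks = dict(ranks)
        max_change = 0.0
        newly_marked = set()
        for v in affected:
            pulled = sum(ranks[u] / len(out_edges[u]) for u in in_edges[v])
            r = (1.0 - damping) / n + damping * pulled
            change = abs(r - ranks[v])
            new_ranks[v] = r
            max_change = max(max_change, change)
            # Assumed expansion rule: a noticeable change at v may alter
            # the ranks of the vertices v points to.
            if change > frontier_tol:
                newly_marked |= out_edges[v]
        affected |= newly_marked
        ranks = new_ranks
        if max_change <= tol:
            break
    return ranks


def static_pagerank(out_edges):
    """Baseline: start from uniform ranks and treat every vertex as affected."""
    n = len(out_edges)
    uniform = {v: 1.0 / n for v in out_edges}
    return iterate_ranks(out_edges, uniform, set(out_edges))


def frontier_update(old_out, new_out, prev_ranks, deletions, insertions):
    """Incremental update: seed the frontier from the batch of edge changes
    and start from the previous ranks instead of uniform ranks."""
    affected = set()
    for u, _v in list(deletions) + list(insertions):
        # Assumed initial marking rule: out-neighbors of the source vertex
        # in both the old and the new graph.
        affected |= old_out[u] | new_out[u]
    return iterate_ranks(new_out, prev_ranks, affected)


# Hypothetical 5-vertex graph, before and after a batch update that
# deletes edge (0, 2) and inserts edge (1, 3).
old_graph = {0: {1, 2}, 1: {2}, 2: {0}, 3: {2, 4}, 4: {3}}
new_graph = {0: {1}, 1: {2, 3}, 2: {0}, 3: {2, 4}, 4: {3}}
previous = static_pagerank(old_graph)
updated = frontier_update(old_graph, new_graph, previous,
                          deletions=[(0, 2)], insertions=[(1, 3)])
```

Starting from the previous ranks and touching only marked vertices is what lets an approach of this kind avoid a full recomputation on small batches. On a graph this tiny the frontier quickly grows to cover every vertex, so the example only illustrates the mechanics, not the speedups reported above.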