Provably Fast and Space-Efficient Parallel Biconnectivity

Biconnectivity is one of the most fundamental graph problems. The canonical parallel biconnectivity algorithm is the Tarjan-Vishkin algorithm, which has optimal $O(n+m)$ work (number of operations) and polylogarithmic span (length of the longest chain of dependent operations) on a graph with $n$ vertices and $m$ edges. However, Tarjan-Vishkin is not widely used in practice. We believe the reason is its space inefficiency: it generates an auxiliary graph with $O(m)$ edges. In practice, existing parallel implementations are based on breadth-first search (BFS). Since BFS has span proportional to the diameter of the graph, existing parallel biconnected-components (BCC) implementations perform poorly on large-diameter graphs and can be even slower than the sequential algorithm on many real-world graphs. We propose FAST-BCC, the first parallel biconnectivity algorithm that has optimal work and polylogarithmic span and is space-efficient. Our algorithm first generates a skeleton graph based on any spanning tree of the input graph. It then uses the connectivity information of the skeleton to compute the biconnectivity of the original input. All steps in our algorithm are highly parallel. We carefully analyze the correctness of our algorithm, which is highly non-trivial. We implemented FAST-BCC and compared it with existing implementations, including GBBS, Slota and Madduri's algorithm, and the sequential Hopcroft-Tarjan algorithm. We ran them on a 96-core machine on 27 graphs, including social, web, road, $k$-NN, and synthetic graphs, with significantly varying sizes and edge distributions. FAST-BCC is the fastest on all 27 graphs. On average (geometric means), FAST-BCC is 5.1$\times$ faster than GBBS, and 3.1$\times$ faster than the best existing baseline on each graph.
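To make the problem concrete, the following is a minimal Python sketch of a textbook DFS-based sequential biconnected-components routine in the style of Hopcroft-Tarjan, the sequential baseline named above. It is not FAST-BCC and not the implementation used in the experiments. The function name `biconnected_components`, the edge-list input format, and the assumption of a simple undirected graph (no self-loops or parallel edges) are illustrative choices.

```python
from collections import defaultdict


def biconnected_components(n, edges):
    """Return the biconnected components of a simple undirected graph.

    Illustrative sketch: vertices are 0..n-1, `edges` is a list of (u, v)
    pairs, and each component is returned as a set of vertices.
    Isolated vertices produce no component.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = [-1] * n      # DFS discovery time, -1 means unvisited
    low = [0] * n        # lowest discovery time reachable from the subtree
    edge_stack = []      # edges seen since the last component was emitted
    components = []
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        for v in adj[u]:
            if v == parent:
                continue  # assumes no parallel edges
            if disc[v] == -1:
                # Tree edge: recurse, then check whether u separates v's subtree.
                edge_stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:
                    comp = set()
                    while True:
                        a, b = edge_stack.pop()
                        comp.update((a, b))
                        if (a, b) == (u, v):
                            break
                    components.append(comp)
            elif disc[v] < disc[u]:
                # Back edge to an ancestor.
                edge_stack.append((u, v))
                low[u] = min(low[u], disc[v])

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return components


# Two triangles sharing vertex 2 form two biconnected components.
example = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
result = biconnected_components(5, example)
```

The recursion follows a single DFS, so the dependency chain can be as long as the DFS tree is deep. That is why this style serves as a sequential baseline rather than a parallel algorithm. Because the sketch is recursive, deep graphs would also need an explicit stack or a raised recursion limit in Python.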
