This paper proposes efficient solutions for $k$-core decomposition with high parallelism. The problem of $k$-core decomposition is fundamental in graph analysis and has applications across various domains. However, existing algorithms face significant challenges in achieving work-efficiency in theory and/or high parallelism in practice, and suffer from various performance bottlenecks. We present a simple, work-efficient parallel framework for $k$-core decomposition that is easy to implement and adaptable to various strategies for improving work-efficiency. We introduce two techniques to enhance parallelism: a sampling scheme to reduce contention on high-degree vertices, and vertical granularity control (VGC) to mitigate scheduling overhead for low-degree vertices. Furthermore, we design a hierarchical bucket structure to optimize performance for graphs with high coreness values. We evaluate our algorithm on a diverse set of real-world and synthetic graphs. Compared to state-of-the-art parallel algorithms, including ParK, PKC, and Julienne, our approach demonstrates superior performance on 23 out of 25 graphs when tested on a 96-core machine. Our algorithm shows speedups of up to 315$\times$ over ParK, 33.4$\times$ over PKC, and 52.5$\times$ over Julienne.
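For readers unfamiliar with the problem, the following is a minimal sequential sketch of $k$-core decomposition by peeling: repeatedly remove a vertex of minimum remaining degree and record the largest such degree seen so far as its coreness. It is not the parallel framework proposed here and includes none of the sampling, VGC, or hierarchical-bucket techniques. The function name `core_numbers` and the adjacency-dictionary input are assumptions made for illustration, and the graph is assumed to be simple and undirected, with every edge listed in both directions.

```python
def core_numbers(adj):
    """Return {vertex: coreness} for a simple undirected graph.

    `adj` maps each vertex to a list of its neighbors (hypothetical input
    format; each edge must appear in both endpoints' lists).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_deg = max(degree.values(), default=0)

    # buckets[d] holds the not-yet-removed vertices whose remaining degree is d.
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)

    coreness = {}
    k = 0  # largest minimum degree seen so far
    d = 0  # index of the bucket the last vertex was taken from
    for _ in range(len(adj)):
        # Removing one vertex lowers each neighbor's degree by one, so the
        # new minimum degree is at least d - 1; resume the scan from there.
        d = max(d - 1, 0)
        while not buckets[d]:
            d += 1
        k = max(k, d)

        v = buckets[d].pop()
        coreness[v] = k

        # Move each remaining neighbor down one bucket.
        for u in adj[v]:
            if u not in coreness:
                buckets[degree[u]].discard(u)
                degree[u] -= 1
                buckets[degree[u]].add(u)
    return coreness
```

On a triangle with one pendant vertex attached, `core_numbers({0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]})` assigns coreness 2 to the three triangle vertices and coreness 1 to the pendant vertex.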