Core decomposition is a well-established graph mining problem with a wide range of applications; it partitions a graph into a hierarchy of subgraphs. From the perspective of vertex convergence dependency, solutions to this problem follow either a bottom-up or a top-down approach. However, despite the growing demand for higher performance, existing algorithms have not effectively harnessed GPUs to accelerate core decomposition. Moreover, approaching the performance limits of core decomposition from these two directions within a parallel synchronization structure has not been thoroughly explored. This paper introduces PICO, an efficient GPU acceleration framework for the Peel and Index2core paradigms of k-core decomposition. We propose PeelOne, a Peel-based algorithm designed to simplify the parallel logic and minimize atomic operations by eliminating vertices that are 'under-core'. We also propose HistoCore, an Index2core-based algorithm that addresses the extensive redundant computation across both vertices and edges. Extensive experiments on an NVIDIA RTX 3090 GPU show that PeelOne outperforms all other Peel-based algorithms and that HistoCore outperforms all other Index2core-based algorithms. Furthermore, HistoCore even outperforms PeelOne, with speedups of 1.1x to 3.2x on six datasets, which breaks the stereotype that the Index2core paradigm performs much worse than the Peel paradigm in a shared-memory parallel setting.
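To make the two paradigms concrete, the following is a minimal sequential sketch in Python. It is not PeelOne or HistoCore and involves no GPU code. It only illustrates the textbook bottom-up idea (peel vertices whose remaining degree is at most k) and the top-down idea as it is commonly described (start from the degree and repeatedly lower each vertex's estimate to the h-index of its neighbors' estimates). The function names `peel_core`, `index2core`, and `h_index`, and the adjacency-dictionary input format, are assumptions made for illustration.

```python
def peel_core(adj):
    """Bottom-up sketch: remove vertices with remaining degree <= k, level by level."""
    deg = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        frontier = [v for v in remaining if deg[v] <= k]
        if not frontier:
            k += 1  # nothing left to peel at this level
            continue
        while frontier:
            v = frontier.pop()
            if v not in remaining:
                continue  # already peeled via a duplicate frontier entry
            remaining.discard(v)
            core[v] = k
            for u in adj[v]:
                if u in remaining:
                    deg[u] -= 1
                    if deg[u] <= k:
                        frontier.append(u)
    return core


def h_index(values):
    """Largest h such that at least h of the values are >= h."""
    h = 0
    for i, x in enumerate(sorted(values, reverse=True), start=1):
        if x < i:
            break
        h = i
    return h


def index2core(adj):
    """Top-down sketch: estimates start at the degree and shrink until stable."""
    est = {v: len(ns) for v, ns in adj.items()}
    changed = True
    while changed:
        changed = False
        new_est = {}
        for v, ns in adj.items():
            # The estimate never increases; min() makes that explicit.
            new_est[v] = min(est[v], h_index(est[u] for u in ns))
            if new_est[v] != est[v]:
                changed = True
        est = new_est  # synchronous update, one round per loop iteration
    return est
```

On an undirected graph given as symmetric adjacency lists, both functions should return the same core number for every vertex. They differ in the direction from which each vertex's value converges, which is the distinction the abstract draws between the two paradigms.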