Community detection is the problem of identifying densely connected clusters of nodes within a network. The Louvain algorithm is a widely used method for this task, but it can produce communities that are internally disconnected. The Leiden algorithm was introduced to address this. In this technical report, we propose another approach to mitigate the issue: GSP-Louvain, a new parallel algorithm based on the Louvain algorithm. On a system with two 16-core Intel Xeon Gold 6226R processors, GSP-Louvain outperforms the original Leiden, igraph Leiden, and NetworKit Leiden by 341x, 83x, and 6.1x, respectively, achieving a processing rate of 328M edges/s on a 3.8B edge graph. Furthermore, GSP-Louvain improves performance by 1.5x for every doubling of threads.
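To make the notion of an "internally disconnected" community concrete, the following minimal sketch checks a given partition for such communities. It is an illustration only and is not the GSP-Louvain procedure described in this report. The function name `disconnected_communities`, the adjacency-dictionary graph format, and the toy graph are all assumptions chosen for the example.

```python
from collections import deque


def disconnected_communities(adj, membership):
    """Return the ids of communities that are internally disconnected.

    adj:        dict mapping each node to an iterable of its neighbors
                (hypothetical format, undirected graph assumed).
    membership: dict mapping each node to its community id.
    """
    # Group nodes by community id.
    members = {}
    for node, comm in membership.items():
        members.setdefault(comm, []).append(node)

    bad = set()
    for comm, nodes in members.items():
        # Breadth-first search from one member, following only edges
        # whose other endpoint lies in the same community.
        start = nodes[0]
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj.get(u, ()):
                if membership.get(w) == comm and w not in seen:
                    seen.add(w)
                    queue.append(w)
        # If some member was never reached, the community is split.
        if len(seen) != len(nodes):
            bad.add(comm)
    return bad


# Toy path graph 0-1-2-3-4. Node 2 belongs to community "b", so the two
# halves of community "a" are joined only through a node outside "a".
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
membership = {0: "a", 1: "a", 2: "b", 3: "a", 4: "a"}
result = disconnected_communities(adj, membership)
```

In this toy partition, community "a" is reported because nodes {0, 1} and {3, 4} can reach each other only via node 2, which is in community "b". A check of this kind runs in time linear in the number of nodes and edges, so it can be used to audit the output of any community detection method.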