Community detection is the problem of identifying densely connected clusters of vertices within a network. The Louvain algorithm is commonly used for this task, but it can produce internally disconnected communities, that is, communities whose vertices are not all reachable from one another through edges inside the community. The Leiden algorithm was introduced to address this issue. This technical report introduces GSP-Louvain, a parallel algorithm based on Louvain that also mitigates it. On a system with two 16-core Intel Xeon Gold 6226R processors, GSP-Louvain outperforms Leiden, NetworKit Leiden, and cuGraph Leiden by 391x, 6.9x, and 2.6x, respectively, and processes 410M edges per second on a graph with 3.8B edges. Its performance also improves by a factor of 1.5 for every doubling of the thread count.
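To make the notion of an internally disconnected community concrete, the following is a minimal sketch of how one might check a given partition for this defect. It is not the GSP-Louvain method described in the report; the function name `disconnected_communities` and the input layout (an edge list plus a vertex-to-community dictionary) are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def disconnected_communities(edges, membership):
    """Return the ids of communities whose induced subgraph is not connected.

    edges:      iterable of (u, v) pairs, treated as undirected (assumed layout)
    membership: dict mapping each vertex to its community id (assumed layout)
    """
    # Keep only edges whose endpoints lie in the same community.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v and membership[u] == membership[v]:
            adj[u].add(v)
            adj[v].add(u)

    # Group vertices by community.
    members = defaultdict(list)
    for vertex, community in membership.items():
        members[community].append(vertex)

    bad = set()
    for community, vertices in members.items():
        # Breadth-first search from one member, using intra-community edges only.
        seen = {vertices[0]}
        queue = deque([vertices[0]])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        # If some member was never reached, the community is disconnected.
        if len(seen) < len(vertices):
            bad.add(community)
    return bad


# Vertices 0 and 2 share community "a" but are linked only through vertex 1,
# which belongs to community "b".
edges = [(0, 1), (1, 2), (3, 4)]
membership = {0: "a", 1: "b", 2: "a", 3: "c", 4: "c"}
print(disconnected_communities(edges, membership))  # prints {'a'}
```

In this toy graph, community "a" is flagged because its two vertices have no path between them that stays inside the community, which is the situation the abstract describes for Louvain output.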