Community detection is the problem of identifying natural divisions in networks. Efficient parallel algorithms for this task are crucial in many applications, particularly as datasets grow large. This technical report presents an optimized parallel implementation of the Label Propagation Algorithm (LPA), a high-speed community detection method, for shared-memory multicore systems. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, our LPA, which we term GVE-LPA, outperforms FLPA, igraph LPA, and NetworKit LPA by 139x, 97,000x, and 40x respectively, achieving a processing rate of 1.4B edges/s on a 3.8B-edge graph. In addition, GVE-LPA scales at a rate of 1.7x for every doubling of threads.
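For readers unfamiliar with label propagation, the basic idea can be sketched as follows. This is a minimal sequential illustration, not the GVE-LPA implementation described in this report: the function name `label_propagation`, the adjacency-dictionary input, the `max_iters` parameter, and the rule of breaking ties in favor of the largest label are all assumptions made for this example. Each vertex starts in its own community and repeatedly adopts the label most common among its neighbors until no label changes.

```python
from collections import defaultdict


def label_propagation(adj, max_iters=20):
    """Sequential label propagation sketch (hypothetical helper).

    adj: dict mapping each vertex to a list of its neighbors (undirected).
    Returns a dict mapping each vertex to its community label.
    """
    # Every vertex starts in its own community.
    labels = {v: v for v in adj}
    for _ in range(max_iters):
        changed = 0
        for v in adj:
            if not adj[v]:
                continue  # isolated vertices keep their own label
            # Count how often each label occurs among the neighbors.
            counts = defaultdict(int)
            for u in adj[v]:
                counts[labels[u]] += 1
            best = max(counts.values())
            # Assumed tie-break for this sketch: the largest label wins.
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed += 1
        if changed == 0:
            break  # converged: no vertex changed its label
    return labels


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(label_propagation(graph))
# prints {0: 2, 1: 2, 2: 2, 3: 5, 4: 5, 5: 5}
```

On this small graph the sketch separates the two triangles into two communities. The result of label propagation generally depends on the vertex processing order and on the tie-breaking rule, so other choices can yield different partitions.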