Community detection is the problem of identifying natural divisions in networks. Efficient parallel algorithms for this task are crucial in many applications, particularly as datasets grow to substantial scales. This technical report presents an optimized parallel implementation of the Label Propagation Algorithm (LPA), a high-speed community detection method, for shared-memory multicore systems. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, our LPA implementation, which we call GVE-LPA, outperforms FLPA, igraph LPA, and NetworKit LPA by 118,000x, 97,000x, and 40x, respectively, achieving a processing rate of 1.4B edges/s on a 3.8B-edge graph. In addition, GVE-LPA's performance scales by a factor of 1.7x for every doubling of threads.
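For readers unfamiliar with label propagation, the following is a minimal sequential sketch of the general technique, not the GVE-LPA implementation described in this report. It assumes an unweighted, undirected graph stored as an adjacency dictionary. The function name `label_propagation`, the `max_iters` parameter, the fixed vertex order, and the tie-breaking rule (keep the current label if it is among the most frequent, otherwise take the smallest) are illustrative assumptions chosen to make the result deterministic.

```python
from collections import Counter


def label_propagation(adj, max_iters=20):
    """Sequential, asynchronous label propagation (illustrative sketch only).

    adj: dict mapping each vertex to a list of its neighbours (undirected).
    Returns a dict mapping each vertex to its community label.
    """
    # Every vertex starts in its own community.
    labels = {v: v for v in adj}

    for _ in range(max_iters):
        changed = 0
        # Assumption: visit vertices in sorted order for determinism.
        for v in sorted(adj):
            if not adj[v]:
                continue  # isolated vertices keep their own label
            # Count how often each label occurs among the neighbours.
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            # Assumption: prefer the current label on ties, else the smallest.
            new_label = labels[v] if labels[v] in candidates else min(candidates)
            if new_label != labels[v]:
                labels[v] = new_label
                changed += 1
        # Stop once a full pass changes no label.
        if changed == 0:
            break
    return labels


if __name__ == "__main__":
    # Two disjoint triangles: each should end up as one community.
    graph = {
        0: [1, 2], 1: [0, 2], 2: [0, 1],
        3: [4, 5], 4: [3, 5], 5: [3, 4],
    }
    print(label_propagation(graph))
```

A parallel shared-memory version would process vertices concurrently across threads. The sketch leaves that out and only illustrates the per-vertex update rule.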