Community detection in graphs identifies groups of nodes that are more densely connected to each other than to the rest of the graph. Existing studies often focus on optimizing detection performance, but memory constraints become critical when processing large graphs on shared-memory systems. We recently proposed efficient implementations of the Louvain, Leiden, and Label Propagation (LPA) algorithms for community detection. However, these incur significant memory overhead from their use of collision-free per-thread hashtables. To address this, we introduce memory-efficient alternatives that replace the per-thread hashtables with weighted Misra-Gries (MG) sketches. This reduces the memory demands of the Louvain, Leiden, and LPA implementations while incurring only a minor drop in quality (up to 1%) and a moderate runtime penalty. We believe that these approaches, though slightly slower, are well suited to parallel processing and could outperform current memory-intensive techniques on systems with many threads.
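To make the central idea concrete, the following is a minimal sketch of a weighted Misra-Gries summary used to pick the heaviest neighboring community of a vertex with at most `k` counters, instead of a hashtable with one entry per distinct community. It uses the textbook weighted Misra-Gries update and is not necessarily the exact variant used in the implementations described above. The function names (`weighted_mg_update`, `heaviest_neighbor_label`), the default `k = 8`, and the input layout (a list of `(neighbor, edge_weight)` pairs plus a `labels` mapping) are assumptions made for illustration only.

```python
def weighted_mg_update(sketch, key, weight, k):
    """Add `weight` for `key` to a Misra-Gries sketch with at most k >= 1 entries.

    `sketch` is a plain dict mapping key -> counter (hypothetical layout).
    """
    if key in sketch:
        # Key is already tracked: accumulate its weight.
        sketch[key] += weight
    elif len(sketch) < k:
        # A free slot exists: start tracking the key.
        sketch[key] = weight
    else:
        # Sketch is full: decrement every counter by the same amount d,
        # charging d against the incoming weight as well.
        d = min(weight, min(sketch.values()))
        for tracked in list(sketch):
            sketch[tracked] -= d
            if sketch[tracked] <= 0:
                del sketch[tracked]  # frees a slot
        rest = weight - d
        if rest > 0:
            # rest > 0 implies the smallest counter hit zero, so a slot is free.
            sketch[key] = rest


def heaviest_neighbor_label(neighbors, labels, k=8):
    """Approximate the community with the largest total edge weight.

    `neighbors` is an iterable of (vertex, edge_weight) pairs and `labels`
    maps each vertex to its community (both are illustrative assumptions).
    """
    sketch = {}
    for vertex, weight in neighbors:
        weighted_mg_update(sketch, labels[vertex], weight, k)
    # The result is approximate: a counter may underestimate the true weight.
    return max(sketch, key=sketch.get) if sketch else None


# Usage example with a deliberately tiny sketch (k = 2).
labels = {1: "a", 2: "b", 3: "a", 4: "c", 5: "a"}
neighbors = [(1, 2.0), (2, 1.0), (3, 4.0), (4, 1.0), (5, 3.0)]
best = heaviest_neighbor_label(neighbors, labels, k=2)
```

Each thread then needs only `k` counters per sketch, independent of the number of distinct communities it encounters. The trade-off is that counters can underestimate true weights, so the chosen community is not guaranteed to be the exact maximum, which is consistent with the small quality drop reported above.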