To address the challenges of processing massive graphs, which are prevalent in diverse fields such as social, biological, and technical networks, we introduce HeiStreamE and FreightE, two innovative (buffered) streaming algorithms for efficient edge partitioning of large-scale graphs. HeiStreamE uses an adapted Split-and-Connect graph model and a Fennel-based multilevel partitioning scheme, while FreightE partitions a hypergraph representation of the input graph. Besides delivering superior solution quality, these approaches overcome the limitations of existing algorithms: their time and memory complexity depend linearly on the graph size and are independent of the number of partition blocks. Our comprehensive experimental analysis demonstrates that HeiStreamE outperforms current streaming algorithms and the re-streaming algorithm 2PS in partitioning quality (replication factor), and is more memory-efficient for real-world networks where the number of edges is far greater than the number of vertices. Further, FreightE is shown to compute partitions quickly and efficiently, particularly for higher numbers of partition blocks.
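The quality metric named above, the replication factor, can be illustrated with a short sketch. It uses the standard definition (the average number of blocks in which a vertex appears once every edge has been assigned to exactly one block) and is not taken from HeiStreamE or FreightE. The function name `replication_factor`, the input layout, and the choice to ignore isolated vertices are assumptions made for illustration only.

```python
from collections import defaultdict


def replication_factor(edges, block_of_edge):
    """Average number of distinct blocks per vertex for an edge partition.

    edges: list of (u, v) pairs.
    block_of_edge: list of block ids, one per edge, in the same order.
    Assumption: vertices without any incident edge are not counted.
    """
    blocks_of_vertex = defaultdict(set)
    for (u, v), block in zip(edges, block_of_edge):
        # A vertex is replicated in every block that holds one of its edges.
        blocks_of_vertex[u].add(block)
        blocks_of_vertex[v].add(block)
    if not blocks_of_vertex:
        return 0.0
    total_replicas = sum(len(blocks) for blocks in blocks_of_vertex.values())
    return total_replicas / len(blocks_of_vertex)


# A 4-cycle split into two blocks: vertices 0 and 2 appear in both blocks.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
assignment = [0, 0, 1, 1]
rf = replication_factor(edges, assignment)
print(rf)  # 1.5
```

A replication factor of 1.0 means no vertex is replicated; larger values indicate more vertex copies across blocks and hence more communication in a distributed computation.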