Partitioning the vertices of a (hyper)graph into k roughly balanced blocks such that few (hyper)edges run between blocks is a key problem for large-scale distributed processing. A current trend for partitioning huge (hyper)graphs with low computational resources is the use of streaming algorithms. In this work, we propose FREIGHT: a Fast stREamInG Hypergraph parTitioning algorithm, which is an adaptation of the widely known graph-based algorithm Fennel. By using an efficient data structure, we make the overall running time of FREIGHT linearly dependent on the pin-count of the hypergraph and the memory consumption linearly dependent on the number of nets and blocks. Our extensive experiments show that FREIGHT is a highly efficient and effective solution for streaming hypergraph partitioning. Its running time is competitive with that of the Hashing algorithm, differing by at most a factor of four on three fourths of the instances. More importantly, FREIGHT outperforms all existing (buffered) streaming algorithms and even the in-memory algorithm HYPE with respect to both the cut-net and the connectivity measures. This indicates that our proposed algorithm is a promising hypergraph partitioning tool for tackling the challenges posed by large-scale and dynamic data processing.
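To make the idea of a Fennel-style streaming pass over a hypergraph concrete, the following is a minimal sketch and not the FREIGHT implementation itself. It assumes vertices arrive one at a time together with their incident nets, that the number of vertices is known in advance, and that each block is scored by the number of incident nets so far seen only in that block, minus a Fennel-like load penalty `alpha * gamma * load ** (gamma - 1)`. The function name `stream_partition`, the parameters `alpha`, `gamma`, and `epsilon`, and the `CUT` marker are all illustrative assumptions. For simplicity the sketch scans all k blocks for every vertex, so it does not reproduce the efficient data structure that makes the running time linear in the pin-count; it does keep memory linear in the number of nets and blocks, apart from the returned assignment.

```python
import math
from typing import Dict, Hashable, Iterable, List, Tuple

CUT = -1  # hypothetical marker: the net already spans more than one block


def stream_partition(
    stream: Iterable[Tuple[int, List[Hashable]]],
    num_vertices: int,
    k: int,
    alpha: float,
    gamma: float = 1.5,     # assumed default, not taken from the abstract
    epsilon: float = 0.03,  # assumed allowed imbalance
) -> Dict[int, int]:
    """One-pass, Fennel-style assignment of hypergraph vertices to k blocks."""
    # Balance constraint: no block may exceed this many vertices.
    capacity = math.ceil((1 + epsilon) * num_vertices / k)
    load = [0] * k                       # current number of vertices per block
    net_block: Dict[Hashable, int] = {}  # one entry per net: its block, or CUT
    assignment: Dict[int, int] = {}

    for vertex, nets in stream:
        # Count, per block, the incident nets that are so far uncut
        # and whose pins all lie in that block.
        gain = [0] * k
        for net in nets:
            block = net_block.get(net, CUT)  # unseen nets contribute nothing
            if block != CUT:
                gain[block] += 1

        # Pick the non-full block with the best gain-minus-penalty score.
        best, best_score = None, -math.inf
        for block in range(k):
            if load[block] >= capacity:
                continue
            score = gain[block] - alpha * gamma * load[block] ** (gamma - 1)
            if score > best_score:
                best, best_score = block, score

        assignment[vertex] = best
        load[best] += 1

        # Update net state: first pin fixes the block, a second block cuts it.
        for net in nets:
            previous = net_block.setdefault(net, best)
            if previous != best:
                net_block[net] = CUT

    return assignment
```

With a small `alpha`, the gain term dominates and vertices sharing a net tend to land in the same block; with a large `alpha`, the penalty dominates and the assignment approaches pure load balancing. Choosing `alpha` for a given instance is left open in this sketch.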