Cycles are among the fundamental subgraph patterns, and enumerating them in graphs enables important applications in a wide variety of fields, including finance, biology, chemistry, and network science. However, cycle enumeration in real-world applications requires efficient parallel algorithms. In this work, we propose scalable parallelisations of state-of-the-art sequential algorithms for enumerating simple, temporal, and hop-constrained cycles. First, we focus on the simple cycle enumeration problem and parallelise the algorithms by Johnson and by Read and Tarjan in a fine-grained manner. We theoretically show that the resulting fine-grained parallel algorithms are scalable, with the fine-grained parallel Read-Tarjan algorithm being strongly scalable. In contrast, we show that straightforward coarse-grained parallel versions of these simple cycle enumeration algorithms, which exploit edge- or vertex-level parallelism, are not scalable. Next, we adapt our fine-grained approach to enumerate cycles under time-window, temporal, and hop constraints. Our evaluation on a cluster with 256 CPU cores that can execute up to 1024 simultaneous threads demonstrates near-linear scalability of our fine-grained parallel algorithms when enumerating cycles under these constraints. On the same cluster, our fine-grained parallel algorithms achieve, on average, an order-of-magnitude speedup over the respective coarse-grained parallel versions of the state-of-the-art algorithms for cycle enumeration. The performance gap between the fine-grained and the coarse-grained parallel algorithms widens as the number of CPU cores increases.
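To make the coarse-grained baseline concrete, the following is a minimal Python sketch of vertex-level parallel enumeration of simple cycles with an optional hop constraint. It is an illustration under stated assumptions, not the method of this work: the search is plain backtracking rather than the Johnson or Read-Tarjan algorithm, and the names `cycles_from`, `enumerate_cycles_coarse`, `max_hops`, and `workers` are hypothetical. Each task handles all cycles whose smallest vertex is a given start vertex, so a single start vertex can end up holding most of the work, which is one plausible intuition for why this decomposition scales poorly.

```python
from concurrent.futures import ThreadPoolExecutor


def cycles_from(graph, start, max_hops=None):
    """Return every simple cycle whose smallest vertex is `start`.

    `graph` maps each vertex to a list of its out-neighbours.
    `max_hops` (hypothetical parameter, assumed >= 1) bounds the
    number of edges in a reported cycle.
    """
    found = []
    path = [start]
    on_path = {start}

    def dfs(v):
        for w in graph.get(v, ()):
            if w == start:
                # Closing edge back to the start: path is a simple cycle.
                found.append(list(path))
            elif w > start and w not in on_path:
                # Only visit vertices larger than `start`, so each cycle
                # is reported exactly once, from its smallest vertex.
                if max_hops is not None and len(path) >= max_hops:
                    continue  # extending would exceed the hop limit
                path.append(w)
                on_path.add(w)
                dfs(w)
                path.pop()
                on_path.remove(w)

    dfs(start)
    return found


def enumerate_cycles_coarse(graph, max_hops=None, workers=4):
    """Coarse-grained decomposition: one independent task per start vertex.

    Threads are used only to show the task structure; in CPython the
    global interpreter lock prevents a real speedup for this workload.
    """
    starts = sorted(graph)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_start = pool.map(lambda s: cycles_from(graph, s, max_hops), starts)
        return [cycle for cycles in per_start for cycle in cycles]


if __name__ == "__main__":
    g = {0: [1], 1: [2], 2: [0, 3], 3: [1]}
    print(enumerate_cycles_coarse(g))
    print(enumerate_cycles_coarse(g, max_hops=2))
```

A fine-grained scheme, by contrast, would split the recursive search inside a single `cycles_from` call across workers; the sketch deliberately does not attempt this.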