We consider the problem of counting 4-cycles ($C_4$) in an undirected graph $G$ with $n$ vertices and $m$ edges (in bipartite graphs, 4-cycles are often called $\textit{butterflies}$). Several previous algorithms for this problem sort the graph by degree and use randomized hash tables. Hash tables are appealing in theory because of their compact storage and fast average-case access, but their performance can degrade unpredictably and they are vulnerable to adversarial input. We develop a new, simpler algorithm for counting $C_4$ that requires $O(m\bar\delta(G))$ time and $O(n)$ space, where $\bar\delta(G) \leq O(\sqrt{m})$ is the $\textit{average degeneracy}$ parameter introduced by Burkhardt, Faber \& Harris (2020). It offers several practical improvements over previous algorithms; for example, it is fully deterministic, does not require any sorting of the input graph, and uses only addition and array access in its inner loops. To the best of our knowledge, all previous efficient algorithms for $C_4$ counting have required $\Omega(m)$ space in addition to storing the input graph. Our algorithm is very simple and is easily adapted to count the 4-cycles incident to each vertex and each edge. Empirical tests demonstrate that our array-based approach is on average $4\times$ to $7\times$ faster than popular hash table implementations.
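To make the array-based idea concrete, the following is a minimal sketch of a deterministic 4-cycle counter that keeps only $O(n)$ auxiliary arrays and uses no hash tables. It is an illustration under stated assumptions, not the authors' implementation: the function name `count_four_cycles`, the helper `precedes`, and the choice of ordering vertices by (degree, index) are assumptions made here, and the input is assumed to be a simple graph (no self-loops, no duplicate edges) on vertices $0, \dots, n-1$. Each 4-cycle is counted once, at its largest vertex $v$ in that ordering, by accumulating the number of 2-paths from $v$ to each smaller vertex $y$.

```python
def count_four_cycles(n, edges):
    """Count 4-cycles in a simple undirected graph on vertices 0..n-1.

    Illustrative sketch only; names and the (degree, index) ordering
    are assumptions, not taken from the paper.
    """
    # Adjacency lists: this is the storage of the input graph itself.
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    # Auxiliary O(n) arrays: vertex degrees and per-vertex 2-path counters.
    deg = [len(neighbors) for neighbors in adj]
    paths = [0] * n

    def precedes(x, y):
        # Total order by degree, ties broken by index. It is evaluated by
        # direct comparison, so the graph is never sorted.
        return (deg[x], x) < (deg[y], y)

    total = 0
    for v in range(n):
        # Walk 2-paths v - u - y whose other vertices both precede v.
        for u in adj[v]:
            if precedes(u, v):
                for y in adj[u]:
                    if precedes(y, v):
                        # The k-th path reaching y closes k-1 new 4-cycles,
                        # so the running sum equals C(paths[y], 2).
                        total += paths[y]
                        paths[y] += 1
        # Reset only the counters touched for this v.
        for u in adj[v]:
            if precedes(u, v):
                for y in adj[u]:
                    paths[y] = 0
    return total
```

For example, `count_four_cycles(4, [(0, 1), (1, 2), (2, 3), (3, 0)])` counts the single 4-cycle of a square. The inner loops perform only comparisons, additions, and array accesses, which mirrors the property the abstract highlights; the sketch makes no claim about matching the stated $O(m\bar\delta(G))$ bound.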