The seminal work of Ahn, Guha, and McGregor (AGM) in 2012 introduced the graph sketching technique and used it to obtain the first streaming algorithms for various graph problems over dynamic streams, in which edges are both inserted and deleted. These include algorithms for cut sparsification, spanners, matchings, and minimum spanning trees (MSTs). These results have since been improved or generalized in various directions, leading to a rich collection of efficient algorithms for processing dynamic graph streams. A curious omission from the list of improvements has been the MST problem. The best algorithm for this problem remains the original AGM algorithm, which, for every integer $p \geq 1$, uses $n^{1+O(1/p)}$ space in $p$ passes on $n$-vertex graphs and thus achieves the desired semi-streaming space of $\tilde{O}(n)$ only at the relatively high cost of $O(\frac{\log{n}}{\log\log{n}})$ passes. On the other hand, no lower bound beyond a folklore one-pass lower bound is known for this problem. We provide a simple explanation for this lack of improvements: the AGM algorithm for MSTs is optimal for the entire range of its number of passes. We prove that even for the simplest decision version of the problem (deciding whether the weight of an MST is at least a given threshold), any $p$-pass dynamic streaming algorithm requires $n^{1+\Omega(1/p)}$ space. This implies that semi-streaming algorithms do need $\Omega(\frac{\log{n}}{\log\log{n}})$ passes. Our result relies on proving new multi-round communication complexity lower bounds for a variant of the universal relation problem that has been instrumental in proving prior lower bounds for single-pass dynamic streaming algorithms. The proof also involves new composition theorems in communication complexity, including majority lemmas and multi-party XOR lemmas, established via information complexity approaches.
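The step from the space bounds to the pass bounds can be unpacked with a short calculation. The following is a sketch only: the constants $c, c' > 0$ hidden in $O(1/p)$ and $\Omega(1/p)$ and the exponent $k \geq 1$ of the polylogarithmic factor in $\tilde{O}(n)$ are names introduced here for illustration, and logarithms are taken base 2.

$$
\text{Upper bound: } p = \frac{c \log n}{\log\log n}
\;\Longrightarrow\;
n^{1 + c/p} = n \cdot 2^{(c \log n)/p} = n \cdot 2^{\log\log n} = n \log n = \tilde{O}(n).
$$

$$
\text{Lower bound: } n^{1 + c'/p} \leq n \log^{k} n
\;\Longrightarrow\;
\frac{c' \log n}{p} \leq k \log\log n
\;\Longrightarrow\;
p \geq \frac{c'}{k} \cdot \frac{\log n}{\log\log n}.
$$

Under these assumptions, the two bounds agree up to constant factors, which is the sense in which $\Theta(\frac{\log{n}}{\log\log{n}})$ passes are necessary and sufficient for semi-streaming space.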