In the semi-streaming model, an algorithm must process any $n$-vertex graph by making one or a few passes over a stream of its edges, use $O(n \cdot \text{polylog }n)$ words of space, and at the end of the last pass, output a solution to the problem at hand. Approximating (single-source) shortest paths on undirected graphs is a longstanding open question in this model. In this work, we make progress on this question on both the upper and lower bound fronts. On the upper bound side, we present a simple randomized algorithm that for any $\epsilon > 0$, with high probability computes $(1+\epsilon)$-approximate shortest paths from a given source vertex in \[ O\left(\frac{1}{\epsilon} \cdot n \log^3 n \right)~\text{space} \quad \text{and} \quad O\left(\frac{1}{\epsilon} \cdot \left(\frac{\log n}{\log\log n} \right) ^2\right) ~\text{passes}. \] The algorithm can also be derandomized and made to work on dynamic streams at a cost of some extra $\text{poly}(\log n, 1/\epsilon)$ factors only in the space. Previously, the best known algorithms for this problem required $1/\epsilon \cdot \log^{c}(n)$ passes, for an unspecified large constant $c$. On the lower bound side, we prove that any semi-streaming algorithm that with large constant probability outputs any constant approximation to shortest paths from a given source vertex (even to a single fixed target vertex and only the distance, not necessarily the path) requires \[ \Omega\left(\frac{\log n}{\log\log n}\right) ~\text{passes}. \] We emphasize that our lower bound holds for any constant-factor approximation of shortest paths. Previously, only constant-pass lower bounds were known, and only for small approximation ratios below two. Our results collectively reduce the gap in the pass complexity of approximating single-source shortest paths in the semi-streaming model from $\text{polylog } n$ vs $\omega(1)$ to only a quadratic gap.
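To make the model concrete, the following is a minimal Python sketch of a multi-pass computation over an edge stream. It is not the algorithm of this work: it is the textbook Bellman-Ford relaxation, which computes exact distances while storing only $O(n)$ words but may need up to $n-1$ passes, which is the kind of pass count the results above improve on. The names `streaming_bellman_ford` and `edge_stream`, and the convention that each call to `edge_stream()` replays the stream once, are assumptions made for illustration.

```python
import math
from typing import Callable, Iterable, List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight) of an undirected edge


def streaming_bellman_ford(
    n: int,
    source: int,
    edge_stream: Callable[[], Iterable[Edge]],
) -> Tuple[List[float], int]:
    """Exact single-source distances using one distance word per vertex.

    Each call to edge_stream() is counted as one pass over the stream.
    Returns the distance array and the number of passes performed.
    """
    dist = [math.inf] * n  # the only state kept between passes: O(n) words
    dist[source] = 0.0
    passes = 0
    while passes < n - 1:  # n - 1 passes always suffice for Bellman-Ford
        changed = False
        for u, v, w in edge_stream():
            # Undirected edge: try to relax in both directions.
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
                changed = True
        passes += 1
        if not changed:  # distances are stable, so further passes are useless
            break
    return dist, passes


# A path 0 - 1 - 2 - 3 whose edges arrive in an unfavourable order.
edges = [(2, 3, 1.0), (1, 2, 1.0), (0, 1, 1.0)]
dist, passes = streaming_bellman_ford(4, 0, lambda: iter(edges))
print(dist, passes)
```

On this adversarially ordered path, each pass extends the set of correctly labelled vertices by only one hop, so the pass count grows linearly with the path length. This illustrates why pass complexity, rather than space, is the bottleneck for shortest paths in this model.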