Spanning Adjacency Oracles in Sublinear Time

Suppose we are given an $n$-node, $m$-edge input graph $G$, and the goal is to compute a spanning subgraph $H$ on $O(n)$ edges. This can be achieved in linear $O(m + n)$ time via breadth-first search. But can we hope for \emph{sublinear} runtime in some range of parameters? If the goal is to return $H$ as an adjacency list, there are simple lower bounds showing that $\Omega(m + n)$ runtime is necessary. If the goal is to return $H$ as an adjacency matrix, then we need $\Omega(n^2)$ time just to write down the entries of the output matrix. However, we show that neither of these lower bounds applies if the goal is instead to return $H$ as an \emph{implicit} adjacency matrix, which we call an \emph{adjacency oracle}. An adjacency oracle is a data structure that gives a user the illusion that an adjacency matrix has been computed: it accepts edge queries $(u, v)$, and it returns in near-constant time a bit indicating whether $(u, v) \in E(H)$. Our main result is that one can construct an adjacency oracle for a spanning subgraph on at most $(1+\varepsilon)n$ edges in $\tilde{O}(n \varepsilon^{-1})$ time, and that this construction time is near-optimal. Additional results include constructions of adjacency oracles for $k$-connectivity certificates and spanners, whose construction times are similarly sublinear on sufficiently dense input graphs. Our adjacency oracles are closely related to Local Computation Algorithms (LCAs) for graph sparsifiers; they can be viewed as LCAs with some computation moved to a preprocessing step in order to speed up queries. Our oracles imply the first local algorithm for computing sparse spanning subgraphs of general input graphs in $\tilde{O}(n)$ query time, which works by constructing our adjacency oracle, querying it once, and then throwing the rest of the oracle away. This addresses an open problem of Rubinfeld [CSR '17].
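
To make the query interface concrete, the following is a minimal Python sketch of an adjacency oracle for a spanning forest. It is an illustration only, under stated assumptions: it builds $H$ with the linear-time $O(m + n)$ breadth-first search baseline mentioned above, not with the sublinear construction, and the class name `BFSAdjacencyOracle`, its `query` method, and the adjacency-list input format are hypothetical choices made for this example.

```python
from collections import deque


class BFSAdjacencyOracle:
    """Illustrative adjacency oracle for a BFS spanning forest H of G.

    Assumption: this uses the linear-time O(m + n) BFS baseline, NOT the
    sublinear construction; it only demonstrates the query interface.
    """

    def __init__(self, n, adj):
        # adj is a hypothetical input format: adj[u] lists the neighbors of u.
        self.parent = [None] * n  # parent[v] = BFS-tree parent of v, or None for roots
        visited = [False] * n
        for root in range(n):
            if visited[root]:
                continue
            # Start a new BFS tree for each connected component.
            visited[root] = True
            queue = deque([root])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if not visited[v]:
                        visited[v] = True
                        self.parent[v] = u  # (u, v) becomes an edge of H
                        queue.append(v)

    def query(self, u, v):
        """Return True iff (u, v) is an edge of H, using O(1) time."""
        return self.parent[u] == v or self.parent[v] == u


# Usage: G is the 4-cycle 0-1-2-3-0; H is a spanning tree with 3 edges.
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
oracle = BFSAdjacencyOracle(4, adj)
print(oracle.query(0, 1), oracle.query(2, 3))
```

The user never sees an explicit adjacency matrix or list for $H$; only the $O(n)$-size parent array is stored, and each edge query is answered from it. The sketch still reads every edge of $G$ during preprocessing, which is exactly the cost that a sublinear construction has to avoid.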
