Spanning Adjacency Oracles in Sublinear Time

Suppose we are given an $n$-node, $m$-edge input graph $G$, and the goal is to compute a spanning subgraph $H$ on $O(n)$ edges. This can be achieved in linear $O(m + n)$ time via breadth-first search. But can we hope for \emph{sublinear} runtime in some range of parameters? If the goal is to return $H$ as an adjacency list, there are simple lower bounds showing that $\Omega(m + n)$ runtime is necessary. If the goal is to return $H$ as an adjacency matrix, then we need $\Omega(n^2)$ time just to write down the entries of the output matrix. However, we show that neither of these lower bounds applies if the goal is instead to return $H$ as an \emph{implicit} adjacency matrix, which we call an \emph{adjacency oracle}. An adjacency oracle is a data structure that gives a user the illusion that an adjacency matrix has been computed: it accepts edge queries $(u, v)$, and it returns in near-constant time a bit indicating whether $(u, v) \in E(H)$. Our main result is that one can construct an adjacency oracle for a spanning subgraph on at most $(1+\varepsilon)n$ edges in $\tilde{O}(n \varepsilon^{-1})$ time, and that this construction time is near-optimal. Additional results include constructions of adjacency oracles for $k$-connectivity certificates and spanners, which are similarly sublinear on sufficiently dense input graphs. Our adjacency oracles are closely related to Local Computation Algorithms (LCAs) for graph sparsifiers; they can be viewed as LCAs with some computation moved to a preprocessing step in order to speed up queries. Our oracles imply the first local algorithm for computing sparse spanning subgraphs of general input graphs in $\tilde{O}(n)$ query time, which works by constructing our adjacency oracle, querying it once, and then throwing the rest of the oracle away. This addresses an open problem of Rubinfeld [CSR '17].
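
To make the adjacency oracle interface concrete, the following is a minimal Python sketch. It is not the construction described above: it builds the oracle with an ordinary breadth-first search in linear $O(m + n)$ time, so it illustrates only the query side (constant-time answers to "is $(u, v) \in E(H)$?") and not the sublinear preprocessing. The names `SpanningForestOracle` and `query`, and the choice to store one parent pointer per node, are assumptions made for illustration.

```python
from collections import deque


class SpanningForestOracle:
    """Illustrative adjacency oracle for a BFS spanning forest H of G.

    Assumption: built with plain linear-time BFS, NOT the sublinear
    construction from the text. Only the query interface is the point.
    """

    def __init__(self, n, adj):
        # adj[u] is an iterable of the neighbors of node u in G.
        # parent[v] is the BFS parent of v in H, or None if v is a root.
        self.parent = [None] * n
        visited = [False] * n
        for root in range(n):
            if visited[root]:
                continue
            visited[root] = True
            queue = deque([root])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if not visited[v]:
                        visited[v] = True
                        self.parent[v] = u
                        queue.append(v)

    def query(self, u, v):
        # (u, v) is an edge of H exactly when one endpoint is the
        # BFS parent of the other; this takes O(1) time.
        return self.parent[u] == v or self.parent[v] == u


# Example: a 4-cycle 0-1-2-3-0 plus the chord (0, 2).
adj = [[1, 2, 3], [0, 2], [0, 1, 3], [0, 2]]
oracle = SpanningForestOracle(4, adj)
print(oracle.query(0, 1))  # True: (0, 1) is a tree edge
print(oracle.query(1, 2))  # False: (1, 2) is in G but not in H
```

In this sketch the oracle stores only $n$ parent pointers rather than an $n \times n$ matrix, yet the user can query any pair $(u, v)$ as if the full adjacency matrix of $H$ had been written down.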
