We study the query complexity of the metric Steiner Tree problem, where we are given an $n \times n$ metric on a set $V$ of vertices along with a set $T \subseteq V$ of $k$ terminals, and the goal is to find a tree of minimum cost that contains all terminals in $T$. The query complexity of the related minimum spanning tree (MST) problem is well understood: for any fixed $\varepsilon > 0$, one can estimate the MST cost to within a $(1+\varepsilon)$-factor using only $\tilde{O}(n)$ queries, and this is known to be tight. This implies that a $(2 + \varepsilon)$-approximate estimate of the Steiner Tree cost can be obtained with $\tilde{O}(k)$ queries by simply applying the MST cost estimation algorithm to the metric induced by the terminals. Our first result shows that any (randomized) algorithm that estimates the Steiner Tree cost to within a $(5/3 - \varepsilon)$-factor requires $\Omega(n^2)$ queries, even if $k$ is a constant. This lower bound is in sharp contrast to an upper bound of $O(nk)$ queries for computing a $(5/3)$-approximate Steiner Tree, which follows from previous work by Du and Zelikovsky. Our second main result, and the main technical contribution of this work, is a sublinear-query algorithm for estimating the Steiner Tree cost to within a strictly better-than-$2$ factor, with query complexity $\tilde{O}(n^{12/7} + n^{6/7}\cdot k)=\tilde{O}(n^{13/7})=o(n^2)$. We complement this result by showing an $\tilde{\Omega}(n + k^{6/5})$ query lower bound for any algorithm that estimates the Steiner Tree cost to within a strictly better-than-$2$ factor. Thus $\tilde{\Omega}(n^{6/5})$ queries are needed just to beat a $2$-approximation when $k = \Omega(n)$, in sharp contrast to MST cost estimation, where a $(1+o(1))$-approximate estimate of the cost is achievable with only $\tilde{O}(n)$ queries.
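To illustrate the baseline mentioned in the abstract, the following is a minimal sketch of why the MST on the terminal-induced metric gives a factor-2 estimate. It is not the sublinear estimator the abstract refers to: it runs exact Prim on the terminals and therefore spends $k(k-1)/2$ distance queries rather than $\tilde{O}(k)$. The function names `terminal_mst_cost` and `star_metric`, and the toy instance itself, are assumptions made for illustration only.

```python
def terminal_mst_cost(dist, terminals):
    """Exact MST cost on the metric induced by the terminals (Prim's algorithm).

    `dist(u, v)` is a distance oracle; every call counts as one query.
    Returns (mst_cost, number_of_queries).
    """
    terminals = list(terminals)
    queries = 0
    # best[t] = cheapest known connection from t to the tree built so far
    best = {t: float("inf") for t in terminals[1:]}
    current = terminals[0]
    total = 0.0
    while best:
        # Query the distance from the newly added vertex to every remaining terminal.
        for t in best:
            queries += 1
            d = dist(current, t)
            if d < best[t]:
                best[t] = d
        # Add the closest remaining terminal to the tree.
        current = min(best, key=best.get)
        total += best.pop(current)
    return total, queries


def star_metric(u, v):
    """Hypothetical toy metric: vertex 0 is a hub at distance 1 from every
    other vertex; any two distinct non-hub vertices are at distance 2."""
    if u == v:
        return 0.0
    if u == 0 or v == 0:
        return 1.0
    return 2.0


# Terminals are the three leaves; the hub (vertex 0) is a Steiner vertex.
cost, queries = terminal_mst_cost(star_metric, [1, 2, 3])
optimal_steiner_cost = 3.0  # the star through the hub, known by construction
print(cost, queries, cost / optimal_steiner_cost)
```

On this instance the terminal MST costs 4 while the optimal Steiner tree (the star through the hub) costs 3, so the estimate is off by a factor of 4/3, within the guaranteed factor of 2. Closing the gap between that baseline and the optimum requires querying distances to non-terminal vertices, which is the setting the abstract's lower and upper bounds address.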