Sublinear Algorithms and Lower Bounds for Estimating MST and TSP Cost in General Metrics

We consider the design of sublinear space and query complexity algorithms for estimating the cost of a minimum spanning tree (MST) and the cost of a minimum traveling salesman (TSP) tour in a metric on $n$ points. We first consider the $o(n)$-space regime and show that, when the input is a stream of all $\binom{n}{2}$ entries of the metric, for any $\alpha \ge 2$, both MST and TSP cost can be $\alpha$-approximated using $\tilde{O}(n/\alpha)$ space, and that $\Omega(n/\alpha^2)$ space is necessary for this task. Moreover, we show that even if the streaming algorithm is allowed $p$ passes over a metric stream, it still requires $\tilde{\Omega}(\sqrt{n/\alpha p^2})$ space. We next consider the semi-streaming regime, where computing even the exact MST cost is easy and the main challenge is to estimate TSP cost to within a factor that is strictly better than $2$. We show that, if the input is a stream of all edges of the weighted graph that induces the underlying metric, for any $\varepsilon > 0$, any one-pass $(2-\varepsilon)$-approximation of TSP cost requires $\Omega(\varepsilon^2 n^2)$ space; on the other hand, there is an $\tilde{O}(n)$ space two-pass algorithm that approximates the TSP cost to within a factor of 1.96. Finally, we consider the query complexity of estimating metric TSP cost to within a factor that is strictly better than $2$, when the algorithm is given access to a matrix that specifies pairwise distances between all points. For MST estimation in this model, it is known that a $(1+\varepsilon)$-approximation is achievable with $\tilde{O}(n/\varepsilon^{O(1)})$ queries. We design an algorithm that performs $\tilde{O}(n^{1.5})$ distance queries and achieves a strictly better than $2$-approximation when either the metric is known to contain a spanning tree supported on weight-$1$ edges or the algorithm is given access to a minimum spanning tree of the graph.
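
The factor of $2$ is the natural benchmark because, in any metric, the MST cost $M$ and the optimal TSP tour cost $T$ satisfy $M \le T \le 2M$: deleting one edge of a tour leaves a spanning path, and doubling the MST and shortcutting yields a tour. The Python sketch below checks this sandwich on two tiny metrics. It is only an illustration of the baseline, not one of the sublinear algorithms described above; the function names `mst_cost` and `tsp_cost_bruteforce` are hypothetical, and the brute-force TSP routine is exponential and meant only for very small $n$.

```python
from itertools import permutations


def mst_cost(dist):
    """Exact MST cost via Prim's algorithm on a full distance matrix.

    Reads all n^2 entries and keeps O(n) working state.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known connection of each vertex to the tree
    best[0] = 0
    total = 0
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        total += best[u]
        # Relax connection costs through the newly added vertex.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return total


def tsp_cost_bruteforce(dist):
    """Exact TSP tour cost by trying every tour starting at vertex 0."""
    n = len(dist)
    if n < 2:
        return 0
    best = float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        best = min(best, cost)
    return best


# Line metric on points 0, 1, 3, 6: here the upper bound T = 2M is tight.
points = [0, 1, 3, 6]
line_metric = [[abs(x - y) for y in points] for x in points]

# Uniform metric on 4 points (all distances 1): here T is close to M.
uniform_metric = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

for metric in (line_metric, uniform_metric):
    m, t = mst_cost(metric), tsp_cost_bruteforce(metric)
    print(f"MST = {m}, TSP = {t}, ratio = {t / m:.2f}")
```

On the line metric the tour costs exactly twice the MST, while on the uniform metric the ratio is $4/3$, so reporting the MST cost alone cannot distinguish these cases; this is why beating the factor of $2$ requires additional ideas.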
