Motivated by an application from geodesy, we introduce a novel clustering problem: a $k$-center (or $k$-diameter) problem with a side constraint. For the side constraint, we are given an undirected connectivity graph $G$ on the input points, and a clustering is feasible only if every cluster induces a connected subgraph of $G$. We call the resulting problems the connected $k$-center problem and the connected $k$-diameter problem. We prove several results on the complexity and approximability of these problems. Our main result is an $O(\log^2{k})$-approximation algorithm for the connected $k$-center and the connected $k$-diameter problem. For Euclidean metrics and metrics with constant doubling dimension, the approximation factor of this algorithm improves to $O(1)$. We also consider the special cases in which the connectivity graph is a line or a tree. For the line, we give optimal polynomial-time algorithms; for the tree, we give either an optimal polynomial-time algorithm or a $2$-approximation algorithm for each variant of our model. We complement our upper bounds with several lower bounds.
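To make the side constraint concrete, the following is a minimal sketch of how feasibility and the two objectives could be evaluated for a given clustering. It is not one of the algorithms referred to above. The function names, the adjacency-dictionary representation of $G$, the use of Euclidean distance, and the choice to pick each center from within its own cluster are all assumptions made for illustration.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def is_connected_cluster(cluster, adjacency):
    """Return True if `cluster` induces a connected subgraph of G.

    `adjacency` maps each point id to an iterable of its neighbours in G
    (hypothetical representation chosen for this sketch).
    """
    cluster = set(cluster)
    if not cluster:
        return False
    start = next(iter(cluster))
    seen = {start}
    queue = deque([start])
    # Breadth-first search restricted to vertices inside the cluster.
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v in cluster and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == cluster


def is_feasible(clusters, adjacency):
    """A clustering is feasible if every cluster is connected in G."""
    return all(is_connected_cluster(c, adjacency) for c in clusters)


def k_center_cost(clusters, points):
    """Largest cluster radius; each center is assumed to lie in its cluster."""
    return max(
        min(max(dist(points[c], points[p]) for p in cluster) for c in cluster)
        for cluster in clusters
    )


def k_diameter_cost(clusters, points):
    """Largest pairwise distance inside any single cluster."""
    return max(
        max(dist(points[p], points[q]) for p in cluster for q in cluster)
        for cluster in clusters
    )


# Toy instance: four points whose connectivity graph is the path 0-1-2-3.
points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (5.0, 0.0), 3: (6.0, 0.0)}
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

good = [[0, 1], [2, 3]]  # both clusters are connected in G
bad = [[0, 2], [1, 3]]   # neither cluster is connected in G
```

On this toy instance, `is_feasible(good, adjacency)` holds while `is_feasible(bad, adjacency)` does not, even though both clusterings partition the points into two clusters. This is exactly the restriction the connectivity graph adds on top of the classical objectives.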