Motivated by an application from geodesy, we introduce a novel clustering problem: the $k$-center (or $k$-diameter) problem with a side constraint. The side constraint is given by an undirected connectivity graph $G$ on the input points, and a clustering is feasible only if every cluster induces a connected subgraph of $G$. We call the resulting problems the connected $k$-center problem and the connected $k$-diameter problem. We prove several results on the complexity and approximability of these problems. Our main result is an $O(\log^2{k})$-approximation algorithm for the connected $k$-center and the connected $k$-diameter problem. For Euclidean metrics and metrics with constant doubling dimension, the approximation factor of this algorithm improves to $O(1)$. We also consider the special cases in which the connectivity graph is a line or a tree. For the line, we give optimal polynomial-time algorithms; for the tree, we give either an optimal polynomial-time algorithm or a $2$-approximation algorithm for each variant of our model. We complement our upper bounds with several lower bounds.
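To make the feasibility condition concrete, the following minimal sketch checks whether a given clustering is feasible (every cluster induces a connected subgraph of $G$) and evaluates the two objectives. It is an illustration of the problem definition only, not the approximation algorithm from the paper. The function names, the adjacency-dictionary representation of $G$, the use of Euclidean distance, and the restriction that each center is a point of its own cluster are all assumptions made for this example.

```python
from collections import deque
from itertools import combinations
from math import dist


def is_connected(cluster, adj):
    """Return True if `cluster` induces a connected subgraph of the graph `adj`."""
    cluster = set(cluster)
    if not cluster:
        return False
    start = next(iter(cluster))
    seen = {start}
    queue = deque([start])
    # Breadth-first search that only follows edges staying inside the cluster.
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in cluster and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == cluster


def is_feasible(clusters, adj):
    """A clustering is feasible only if every cluster is connected in G."""
    return all(is_connected(c, adj) for c in clusters)


def max_diameter(clusters, points):
    """k-diameter objective: largest pairwise distance inside any cluster."""
    return max(
        (dist(points[u], points[v]) for c in clusters for u, v in combinations(c, 2)),
        default=0.0,
    )


def max_radius(clusters, points):
    """k-center objective, assuming each center is chosen from its own cluster."""
    return max(
        min(max(dist(points[c], points[v]) for v in cluster) for c in cluster)
        for cluster in clusters
    )


# Hypothetical instance: four points on a line, connectivity graph is the path 0-1-2-3.
points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

good = [[0, 1], [2, 3]]  # both clusters are connected in G
bad = [[0, 2], [1, 3]]   # neither cluster is connected in G

print(is_feasible(good, adj), is_feasible(bad, adj))
print(max_diameter(good, points), max_radius(good, points))
```

In this instance the clustering `bad` is rejected even though it is a valid partition of the points, because vertices 0 and 2 (and likewise 1 and 3) are not adjacent in the path.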