In a metric space, the input is a collection of point sets of roughly equal size together with an integer $k\geq 1$. The goal of data-distributed $k$-center is to choose $k$ of the input points as centers so as to minimize the maximum distance from any input point to its closest center. Metric $k$-center is known to be NP-hard, and this hardness carries over to the data-distributed setting. We give a $2$-approximation algorithm for $k$-center with sublinear $k$ in the data-distributed setting, and this approximation factor is tight. The algorithm works in several models, including the massively parallel computation (MPC) model.
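To make the objective concrete, the following is a minimal sketch of the classical sequential farthest-first greedy heuristic, which is a well-known $2$-approximation for metric $k$-center on a single machine. It is not the data-distributed algorithm of this work: it simply runs on the union of the input point sets and ignores how they are distributed. The names `greedy_k_center` and `radius`, and the choice of Euclidean distance, are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def radius(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-center cost: the maximum distance from a point to its closest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)


def greedy_k_center(points: Sequence[Point], k: int) -> List[Point]:
    """Farthest-first traversal (sequential sketch, Euclidean metric assumed).

    Starts from an arbitrary point and repeatedly adds the point that is
    farthest from the centers chosen so far.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must satisfy 1 <= k <= number of points")

    centers = [points[0]]  # arbitrary first center
    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # Pick the point currently worst served by the chosen centers.
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[far])
        # Update each point's distance to its closest center.
        nearest = [min(d, math.dist(p, points[far]))
                   for d, p in zip(nearest, points)]
    return centers


# Hypothetical input: two point sets of equal size, as in the problem statement.
parts = [[(0.0, 0.0), (1.0, 0.0)], [(10.0, 0.0), (11.0, 0.0)]]
all_points = [p for part in parts for p in part]
centers = greedy_k_center(all_points, k=2)
cost = radius(all_points, centers)
```

On this toy input the sketch selects one center from each cluster. A distributed algorithm would have to reach a comparable guarantee without gathering all points on one machine, which the sketch above does not attempt.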