In fully dynamic clustering problems, a clustering of a given data set in a metric space must be maintained while the data set is modified through insertions and deletions of individual points. In this paper, we resolve the complexity of fully dynamic $k$-center clustering against both adaptive and oblivious adversaries. Against oblivious adversaries, we present the first algorithm for fully dynamic $k$-center in an arbitrary metric space that maintains an optimal $(2+\epsilon)$-approximation in $O(k \cdot \mathrm{polylog}(n,\Delta))$ amortized update time. Here, $n$ is an upper bound on the number of active points at any time, and $\Delta$ is the aspect ratio of the metric space. Previously, the best known amortized update time was $O(k^2\cdot \mathrm{polylog}(n,\Delta))$, due to Chan, Gourqin, and Sozio (2018). Moreover, we demonstrate that our runtime is optimal up to $\mathrm{polylog}(n,\Delta)$ factors. In fact, we prove that even offline algorithms for $k$-clustering tasks in arbitrary metric spaces, including $k$-medians, $k$-means, and $k$-center, must make at least $\Omega(n k)$ distance queries to achieve any non-trivial approximation factor. Amortized over $n$ insertions, this implies an $\Omega(k)$ lower bound on the update time, which holds even in the insertions-only setting. We also show deterministic lower and upper bounds for adaptive adversaries, demonstrate that an update time sublinear in $k$ is possible against oblivious adversaries for metric spaces that admit locality-sensitive hash functions (LSH), and give the first fully dynamic $O(1)$-approximation algorithms for the closely related $k$-sum-of-radii and $k$-sum-of-diameters problems.
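To make the $\Omega(nk)$ query bound concrete, the sketch below shows the classical static greedy algorithm for $k$-center (farthest-first traversal, due to Gonzalez). It is not the dynamic algorithm of this paper; it is included only as a familiar baseline that achieves a 2-approximation using at most $nk$ distance queries, which matches the scale of the lower bound. The function name `greedy_k_center`, its signature, and the query counter are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")


def greedy_k_center(
    points: Sequence[P],
    k: int,
    dist: Callable[[P, P], float],
) -> Tuple[List[P], float, int]:
    """Static farthest-first traversal (illustrative baseline, not the paper's method).

    Returns (centers, radius, number_of_distance_queries).
    """
    if not points or k <= 0:
        raise ValueError("need at least one point and k >= 1")

    queries = 0
    centers = [points[0]]  # arbitrary first center

    # nearest[i] = distance from points[i] to its closest chosen center
    nearest = []
    for p in points:
        nearest.append(dist(p, centers[0]))
        queries += 1

    while len(centers) < min(k, len(points)):
        # Pick the point farthest from the current set of centers.
        far = max(range(len(points)), key=nearest.__getitem__)
        if nearest[far] == 0:
            break  # every point already coincides with some center
        new_center = points[far]
        centers.append(new_center)

        # One pass of n distance queries per added center: at most n*k in total.
        for i, p in enumerate(points):
            d = dist(p, new_center)
            queries += 1
            if d < nearest[i]:
                nearest[i] = d

    return centers, max(nearest), queries
```

Each added center costs one pass of $n$ distance queries, so the total is at most $nk$. The lower bound stated in the abstract says that no algorithm with a non-trivial approximation factor can use asymptotically fewer queries in a general metric space.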