This paper considers correlation clustering on unweighted complete graphs. We give a combinatorial algorithm that returns a single clustering solution that is simultaneously $O(1)$-approximate for all $\ell_p$-norms of the disagreement vector; in other words, a combinatorial $O(1)$-approximation of the all-norms objective for correlation clustering. This is the first proof that only a minimal sacrifice is needed in order to optimize different norms of the disagreement vector simultaneously. In addition, our algorithm is the first combinatorial approximation algorithm for the $\ell_2$-norm objective, and more generally the first combinatorial algorithm for the $\ell_p$-norm objective when $1 < p < \infty$. It is also faster than all previous algorithms that minimize the $\ell_p$-norm of the disagreement vector, with run-time $O(n^\omega)$, where $O(n^\omega)$ is the time to multiply two $n \times n$ matrices. When the maximum positive degree in the graph is at most $\Delta$, the run-time can be improved to $O(n\Delta^2 \log n)$.
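To make the objective concrete, the following is a minimal sketch of how the disagreement vector and its $\ell_p$-norms can be evaluated for a given clustering. It is not the algorithm of this paper. It assumes the usual definition, which the abstract does not spell out: entry $v$ of the disagreement vector counts the edges incident to $v$ that the clustering violates, namely positive edges cut between clusters and negative edges kept inside a cluster. The function names `disagreement_vector` and `lp_norm`, and the small example graph, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def disagreement_vector(n, positive_edges, cluster_of):
    """Return y, where y[v] is the number of violated edges incident to v.

    Assumed setting: a complete graph on vertices 0..n-1 in which every
    pair not listed in positive_edges is a negative edge.
    cluster_of[v] is the cluster label of vertex v.
    """
    positive = {frozenset(e) for e in positive_edges}
    y = [0] * n
    for u, v in combinations(range(n), 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_positive = frozenset((u, v)) in positive
        # A positive edge is violated when its endpoints are separated;
        # a negative edge is violated when its endpoints share a cluster.
        if is_positive != same_cluster:
            y[u] += 1
            y[v] += 1
    return y


def lp_norm(y, p):
    """l_p norm of a nonnegative vector; p may be float('inf')."""
    if p == float("inf"):
        return max(y, default=0)
    return sum(x ** p for x in y) ** (1.0 / p)


# Hypothetical example: a positive triangle {0, 1, 2} plus the positive edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
split = disagreement_vector(4, edges, [0, 0, 0, 1])   # clusters {0,1,2} and {3}
merged = disagreement_vector(4, edges, [0, 0, 0, 0])  # one cluster {0,1,2,3}
norms_split = [lp_norm(split, p) for p in (1, 2, float("inf"))]
norms_merged = [lp_norm(merged, p) for p in (1, 2, float("inf"))]
```

In this toy instance the clustering that isolates vertex 3 has a smaller norm than the single-cluster solution for each of $p = 1, 2, \infty$. In general, different norms can favor different clusterings, which is why a single solution that is $O(1)$-approximate for all of them at once is a nontrivial guarantee.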