We consider the classic Correlation Clustering problem: Given a complete graph where edges are labelled either $+$ or $-$, the goal is to find a partition of the vertices that minimizes the number of $+$ edges across parts plus the number of $-$ edges within parts. Recently, Cohen-Addad, Lee and Newman [CLN22] presented a 1.994-approximation algorithm for the problem using the Sherali-Adams hierarchy, hence breaking through the integrality gap of 2 for the classic linear program and improving upon the 2.06-approximation of Chawla, Makarychev, Schramm and Yaroslavtsev [CMSY15]. We significantly improve the state of the art by providing a 1.73-approximation for the problem. Our approach introduces a preclustering of Correlation Clustering instances that allows us to essentially ignore the error arising from the {\em correlated rounding} used by [CLN22]. This additional power simplifies the previous algorithm and analysis. More importantly, it enables a new {\em set-based rounding} that complements the previous roundings. A combination of these two rounding algorithms yields the improved bound.
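To make the objective concrete, the following minimal sketch counts the disagreements of a given partition on a small signed complete graph. It only illustrates the cost function defined above and is not any of the approximation algorithms mentioned; the function name `correlation_clustering_cost`, its parameters, and the toy instance are assumptions introduced for this example.

```python
from itertools import combinations


def correlation_clustering_cost(num_vertices, plus_edges, cluster_of):
    """Count disagreements of a partition on a complete signed graph.

    Hypothetical helper for illustration only.
    num_vertices: vertices are 0 .. num_vertices - 1.
    plus_edges:   iterable of pairs (u, v) labelled '+'; every other pair is '-'.
    cluster_of:   cluster_of[v] is the part containing vertex v.
    """
    plus = {frozenset(edge) for edge in plus_edges}
    cost = 0
    for u, v in combinations(range(num_vertices), 2):
        same_part = cluster_of[u] == cluster_of[v]
        is_plus = frozenset((u, v)) in plus
        if is_plus and not same_part:
            cost += 1  # a '+' edge across parts
        elif not is_plus and same_part:
            cost += 1  # a '-' edge within a part
    return cost


# Toy instance (assumed): a '+' triangle on {0, 1, 2} and one extra '+' edge (2, 3).
PLUS_EDGES = [(0, 1), (1, 2), (0, 2), (2, 3)]

cost_triangle_plus_singleton = correlation_clustering_cost(4, PLUS_EDGES, [0, 0, 0, 1])
cost_single_cluster = correlation_clustering_cost(4, PLUS_EDGES, [0, 0, 0, 0])
cost_all_singletons = correlation_clustering_cost(4, PLUS_EDGES, [0, 1, 2, 3])
```

On this toy instance, keeping the triangle together and isolating vertex 3 cuts only the $+$ edge $(2,3)$, whereas a single cluster pays for the two $-$ edges $(0,3)$ and $(1,3)$, and all singletons pay for every $+$ edge.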