Individual preference (IP) stability, introduced by Ahmadi et al. (ICML 2022), is a natural clustering objective inspired by stability and fairness constraints. A clustering is $\alpha$-IP stable if, for every data point, the average distance to its own cluster is at most $\alpha$ times the average distance to any other cluster. Unfortunately, determining whether a dataset admits a $1$-IP stable clustering is NP-hard. Moreover, before this work, it was unknown whether an $o(n)$-IP stable clustering always \emph{exists}, as the prior state of the art only guaranteed an $O(n)$-IP stable clustering. We close this gap and show that an $O(1)$-IP stable clustering always exists for general metrics, and we give an efficient algorithm that outputs such a clustering. We also introduce generalizations of IP stability beyond average distance and give efficient, near-optimal algorithms for the cases where we consider the maximum and minimum distances within and between clusters.
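To make the definition of $\alpha$-IP stability concrete, the following is a minimal sketch of a checker for a given clustering. It is not the algorithm from this work. It rests on conventions the abstract does not spell out: the own-cluster average excludes the point itself, singleton clusters are treated as trivially stable, and points are coordinate tuples compared with Euclidean distance. The function names `ip_stability_factor` and `is_ip_stable` are hypothetical.

```python
import math


def ip_stability_factor(points, labels, dist=math.dist):
    """Return the smallest alpha for which the clustering is alpha-IP stable.

    Assumptions (not taken from the abstract): the own-cluster average
    excludes the point itself, and a point alone in its cluster is stable.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    worst = 0.0
    for v, lab in enumerate(labels):
        own = [u for u in clusters[lab] if u != v]
        if not own:
            continue  # singleton cluster: treated as stable by assumption
        own_avg = sum(dist(points[v], points[u]) for u in own) / len(own)

        for other_lab, members in clusters.items():
            if other_lab == lab:
                continue
            other_avg = sum(dist(points[v], points[u]) for u in members) / len(members)
            if other_avg == 0:
                # All points of the other cluster coincide with v.
                if own_avg > 0:
                    return math.inf
                continue
            worst = max(worst, own_avg / other_avg)
    return worst


def is_ip_stable(points, labels, alpha, dist=math.dist):
    """Check whether every point satisfies the alpha-IP stability condition."""
    return ip_stability_factor(points, labels, dist) <= alpha


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
    print(ip_stability_factor(pts, [0, 0, 1, 1]))  # well-separated clustering
    print(ip_stability_factor(pts, [0, 1, 0, 1]))  # interleaved clustering
```

Passing a different `dist` callable allows the same check on any metric. The sketch evaluates every pair of points once per point, so it runs in time quadratic in the number of points.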