We study discrete k-clustering problems in general metric spaces that are constrained by a combination of two fairness conditions from the demographic fairness model. Given a metric space (P,d) in which every point of P carries a protected attribute, and a number k, the goal is to partition P into k clusters, each with a designated center, such that a center-based objective function is minimized and the attributes are fairly distributed with respect to the following two fairness concepts: 1) Group fairness: we aim for clusters with balanced attribute proportions by specifying lower and upper bounds on the proportion of each attribute. 2) Diverse center selection (DS): clusters have natural representatives, namely their centers, and we ask for a balanced set of representatives by specifying the number of centers to choose from each attribute. Dickerson, Esmaeili, Morgenstern and Zhang (2023) refer to the combination of these two constraints as doubly constrained fair clustering. They present algorithms whose guarantees depend on the best known approximation factors for either of these two problems. Currently, this implies an 8-approximation with a small additive violation of the group fairness constraint. For k-center, we improve this approximation factor to 4 with a small additive violation. This guarantee also depends on the best currently known algorithm for DS-fair k-center, given by Jones, Nguyen and Nguyen (2020). For k-median and k-means, we propose the first constant-factor approximation algorithms. Our algorithms transform a solution that satisfies diverse center selection into a doubly constrained fair clustering using an LP-based approach. Furthermore, our results generalize to other center-selection constraints, such as matroid k-clustering and knapsack constraints.
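To make the two constraints concrete, the following Python sketch checks whether a given clustering satisfies both of them exactly and evaluates its k-center cost. It is only an illustration of the problem definition, not the algorithm of this paper: the function names, the argument layout (`attrs`, `assignment`, `centers`, `lower`, `upper`, `quota`) and the choice to skip empty clusters are all assumptions made for the example, and the additive violation allowed by the approximation guarantees is not modeled.

```python
from collections import Counter


def is_doubly_fair(attrs, assignment, centers, lower, upper, quota):
    """Check both fairness constraints exactly (hypothetical helper).

    attrs[p]      -- protected attribute of point p
    assignment[p] -- index of the center that point p is assigned to
    centers       -- list of point indices chosen as centers
    lower, upper  -- dicts: attribute -> allowed proportion bounds per cluster
    quota         -- dict: attribute -> required number of centers
    """
    # Diverse center selection: each attribute supplies exactly its quota.
    center_counts = Counter(attrs[c] for c in centers)
    if any(center_counts.get(a, 0) != q for a, q in quota.items()):
        return False

    # Group fairness: attribute proportions lie within the bounds per cluster.
    clusters = {c: [] for c in centers}
    for p, c in enumerate(assignment):
        clusters[c].append(p)
    for members in clusters.values():
        if not members:
            continue  # assumption: empty clusters are ignored
        counts = Counter(attrs[p] for p in members)
        for a in lower:
            share = counts.get(a, 0) / len(members)
            if share < lower[a] or share > upper[a]:
                return False
    return True


def k_center_cost(dist, assignment):
    """k-center objective: largest distance from a point to its center."""
    return max(dist(p, c) for p, c in enumerate(assignment))


if __name__ == "__main__":
    attrs = ["red", "blue", "red", "blue"]
    positions = [0.0, 1.0, 2.0, 3.0]  # points on a line, for illustration
    dist = lambda p, c: abs(positions[p] - positions[c])
    bounds_lo = {"red": 0.25, "blue": 0.25}
    bounds_hi = {"red": 0.75, "blue": 0.75}
    quota = {"red": 1, "blue": 1}
    assignment = [0, 1, 1, 0]  # clusters {0, 3} and {1, 2}
    print(is_doubly_fair(attrs, assignment, [0, 1], bounds_lo, bounds_hi, quota))
    print(k_center_cost(dist, assignment))
```

In this toy instance the fair assignment pairs each red point with a blue one, which forces point 3 to travel to center 0; the same trade-off between fairness and cost is what the approximation factors above quantify.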