Clustering is a fundamental problem in machine learning and operations research. As fairness considerations have become central to algorithm design, fairness in clustering has received significant attention from the research community, and the resulting literature offers a collection of interesting fairness notions and elaborate algorithms. In this paper, we take a critical view of fair clustering and identify a collection of overlooked issues, such as the lack of a clear utility characterization and the difficulty of accounting for the downstream effects of a fair clustering algorithm in machine learning settings. In some cases, we give examples in which applying a fair clustering algorithm can have a significant negative impact on social welfare. We end by identifying a collection of steps that would lead towards more impactful research in fair clustering.