Clustering algorithms are widely used in societal resource allocation applications such as loan approvals and candidate recruitment, so biased or unfair model outputs can adversely impact the individuals who rely on these applications. Many fair clustering approaches have recently been proposed to counteract this issue. Given the potential for significant harm, it is essential that fair clustering algorithms produce consistently fair outputs even under adversarial influence. However, fair clustering algorithms have not yet been studied from an adversarial attack perspective. We bridge this gap by proposing a novel black-box fairness attack and using it to conduct a robustness analysis of fair clustering. Through comprehensive experiments, we find that state-of-the-art models are highly susceptible to our attack, which significantly reduces their fairness performance. Finally, we propose Consensus Fair Clustering (CFC), the first robust fair clustering approach, which transforms consensus clustering into a fair graph partitioning problem and iteratively learns to generate fair cluster outputs. Experimentally, we observe that CFC is highly robust to the proposed attack, making it a truly robust fair clustering alternative.
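To make "fairness performance" concrete, the sketch below computes one commonly used group-fairness score for a clustering, often called balance. The abstract does not say which metric the experiments use, so the choice of balance, the function name `balance`, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter


def balance(cluster_labels, group_labels):
    """Return the balance of a clustering (illustrative definition, assumed here).

    For each cluster, take the ratio of the smallest to the largest
    protected-group count; the overall score is the minimum over clusters.
    1.0 means every cluster holds all groups in equal numbers, and 0.0
    means some cluster is missing a group entirely.
    """
    if len(cluster_labels) != len(group_labels):
        raise ValueError("cluster_labels and group_labels must have equal length")
    if not cluster_labels:
        raise ValueError("at least one sample is required")

    groups = set(group_labels)
    counts = {}  # cluster id -> Counter of group memberships in that cluster
    for cluster, group in zip(cluster_labels, group_labels):
        counts.setdefault(cluster, Counter())[group] += 1

    # Counter returns 0 for a group that never appears in a cluster.
    return min(
        min(c[g] for g in groups) / max(c[g] for g in groups)
        for c in counts.values()
    )


# Toy data: eight samples alternating between two protected groups.
groups = ["a", "b", "a", "b", "a", "b", "a", "b"]
fair = [0, 0, 1, 1, 0, 0, 1, 1]    # each cluster holds two "a" and two "b"
skewed = [0, 0, 0, 1, 0, 1, 1, 1]  # each cluster is split three to one

print(balance(fair, groups))
print(balance(skewed, groups))
```

Under this definition, a fairness attack of the kind described above would count as successful when it lowers the score of the clustering a fair algorithm returns, and a robust method would keep the score close to its unattacked value.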