Fair graph clustering is crucial for ensuring equitable representation and treatment of diverse communities in network analysis. Traditional methods often ignore disparities among social, economic, and demographic groups, perpetuating biased outcomes and reinforcing inequalities. This study formulates fair graph clustering within the framework of the disparate impact doctrine, treating it as a joint optimization problem that integrates clustering quality and fairness constraints. Because this problem is NP-hard, we approximate it with a semidefinite relaxation. For small and medium-sized graphs, we use an algorithm based on the singular value decomposition, while for larger graphs we propose a novel algorithm based on the alternating direction method of multipliers (ADMM). Unlike existing methods, our formulation allows the trade-off between clustering quality and fairness to be tuned. Experimental results on graphs generated from the standard stochastic block model show that our approach achieves a better accuracy-fairness trade-off than state-of-the-art methods.
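To make the setting concrete, the following is a minimal sketch, not the algorithm proposed in this study. It samples a graph from a stochastic block model and measures how far a given clustering departs from proportional group representation, which is one common reading of disparate impact. The function names `sample_sbm`, `group_proportions`, and `disparate_impact_gap`, and the particular gap measure, are assumptions made for illustration only.

```python
import numpy as np


def sample_sbm(labels, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model.

    Nodes with equal labels are linked with probability p_in, others with p_out.
    """
    labels = np.asarray(labels)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw only the strict upper triangle, then mirror it: no self-loops.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(int)


def group_proportions(clusters, groups):
    """Return a (num_clusters, num_groups) matrix of within-cluster group shares."""
    clusters = np.asarray(clusters)
    groups = np.asarray(groups)
    counts = np.array(
        [
            [np.sum((clusters == c) & (groups == g)) for g in np.unique(groups)]
            for c in np.unique(clusters)
        ]
    )
    return counts / counts.sum(axis=1, keepdims=True)


def disparate_impact_gap(clusters, groups):
    """Largest deviation of any within-cluster group share from the overall share.

    Illustrative measure (an assumption): 0 means every cluster mirrors the
    population's group proportions; larger values mean less balanced clusters.
    """
    groups = np.asarray(groups)
    overall = np.array([np.mean(groups == g) for g in np.unique(groups)])
    return float(np.max(np.abs(group_proportions(clusters, groups) - overall)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    groups = np.repeat([0, 1], 20)  # hypothetical sensitive attribute
    adjacency = sample_sbm(groups, p_in=0.6, p_out=0.1, rng=rng)
    print("edges:", adjacency.sum() // 2)
    # A clustering that simply recovers the groups is maximally unbalanced.
    print("gap (clusters == groups):", disparate_impact_gap(groups, groups))
    # A clustering that alternates nodes mixes both groups evenly.
    print("gap (alternating):", disparate_impact_gap(np.arange(40) % 2, groups))
```

In this toy setup the planted blocks coincide with the sensitive groups, so a clustering that maximizes connectivity alone tends to be unfair, which is the tension a tunable quality-fairness trade-off is meant to address.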