Clustering algorithms may unintentionally propagate or intensify existing disparities, leading to unfair representations or biased decision-making. Current fair clustering methods rely on notions of fairness that capture no information about the underlying causal mechanisms. We show that optimising for non-causal fairness notions can paradoxically induce direct discriminatory effects from a causal standpoint. We present a clustering approach that incorporates causal fairness metrics, giving a more nuanced treatment of fairness in unsupervised learning. The approach lets the user specify which causal fairness metrics should be minimised. We demonstrate the efficacy of our methodology on datasets known to harbour unfair biases.