Discrimination mitigation in machine learning (ML) models can be complicated, because multiple factors, including hierarchical and historical ones, may interweave with each other. Yet few existing fairness measures are able to capture the discrimination level within an ML model in the presence of multiple sensitive attributes. To bridge this gap, we propose a fairness measure based on distances between sets from a manifold perspective, named the 'harmonic fairness measure via manifolds (HFM)', which comes in two optional versions and enables fine-grained discrimination evaluation for several sensitive attributes with multiple values. To accelerate the computation of distances between sets, we further propose two approximation algorithms, 'Approximation of distance between sets for one sensitive attribute with multiple values (ApproxDist)' and 'Approximation of extended distance between sets for several sensitive attributes with multiple values (ExtendDist)', which respectively address bias evaluation for one sensitive attribute with multiple values and for several sensitive attributes with multiple values. Moreover, we provide an algorithmic effectiveness analysis of ApproxDist under certain assumptions, explaining how well it can be expected to work. The empirical results demonstrate that our proposed fairness measure HFM is valid and that the approximation algorithms (i.e., ApproxDist and ExtendDist) are both effective and efficient.
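To make the notion of 'distance between sets' concrete, the sketch below illustrates one plausible reading of it: the model outputs for each group induced by a sensitive attribute form a set of points, and bias is read off from how far those sets lie from one another. The symmetric average nearest-neighbor distance and the worst-case pairwise aggregation used here are illustrative assumptions, not the paper's actual HFM definition; the all-pairs baseline also exposes the quadratic cost that ApproxDist and ExtendDist are designed to avoid.

```python
# Minimal illustrative sketch (not the paper's exact HFM definition):
# treat the model outputs of each demographic group as a set of points,
# and measure bias via a distance between those sets. The symmetric
# average nearest-neighbor distance below is a hypothetical stand-in
# for the manifold-based distance used by HFM.
import numpy as np

def set_distance(A: np.ndarray, B: np.ndarray) -> float:
    """Symmetric average nearest-neighbor distance between point sets.

    A: shape (n, d); B: shape (m, d), e.g. model outputs (or features)
    restricted to two groups defined by a sensitive attribute.
    """
    # Pairwise Euclidean distances, shape (n, m).
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    # Average each point's distance to its nearest neighbor in the other set.
    return 0.5 * (D.min(axis=1).mean() + D.min(axis=0).mean())

def naive_bias_one_attribute(outputs: np.ndarray, attr: np.ndarray) -> float:
    """Worst-case pairwise set distance over all values of one sensitive
    attribute with multiple values -- the exact-but-quadratic baseline
    that an algorithm like ApproxDist is meant to accelerate."""
    values = np.unique(attr)
    pairwise = [
        set_distance(outputs[attr == u], outputs[attr == v])
        for i, u in enumerate(values)
        for v in values[i + 1:]
    ]
    return max(pairwise) if pairwise else 0.0

# Toy usage: one sensitive attribute with three values, 2-D outputs.
rng = np.random.default_rng(0)
outputs = rng.normal(size=(300, 2))
attr = rng.integers(0, 3, size=300)
print(naive_bias_one_attribute(outputs, attr))
```

Because the exact all-pairs computation scales quadratically with group sizes (and combinatorially with the number of attribute values), an approximation step in the spirit of ApproxDist, and ExtendDist for several attributes, is what keeps such an evaluation tractable on large datasets.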