Assessing fairness in artificial intelligence (AI) typically involves AI experts who select protected features, choose fairness metrics, and set fairness thresholds to assess outcome fairness. However, little is known about how stakeholders, particularly those affected by AI outcomes but lacking AI expertise, assess fairness. To address this gap, we conducted a qualitative study with 26 stakeholders without AI expertise, representing potential decision subjects in a credit rating scenario, to examine how they assess fairness when placed in the role of deciding which features to prioritize, which metrics to use, and which thresholds to set. We reveal that stakeholders' fairness decisions are more complex than typical AI expert practices: they considered features far beyond legally protected ones, tailored metrics to specific contexts, set diverse yet stricter fairness thresholds, and even preferred to design customized notions of fairness. Our results extend the understanding of how stakeholders can meaningfully contribute to AI fairness governance and mitigation, underscoring the importance of incorporating stakeholders' nuanced fairness judgments.