As machine learning applications proliferate, we need an understanding of their potential for harm. However, current fairness metrics are rarely grounded in human psychological experiences of harm. Drawing on the social psychology of stereotypes, we use a case study of gender stereotypes in image search to examine how people react to machine learning errors. First, we use survey studies to show that not all machine learning errors reflect stereotypes, nor are all errors equally harmful. Then, in experimental studies, we randomly expose participants to stereotype-reinforcing, stereotype-violating, and stereotype-neutral machine learning errors. We find that stereotype-reinforcing errors induce more experientially (i.e., subjectively) harmful experiences while producing minimal changes in cognitive beliefs, attitudes, or behaviors. This experiential harm impacts women more than men. However, certain stereotype-violating errors are more experientially harmful for men, potentially due to perceived threats to masculinity. We conclude that harm cannot be the sole guide in fairness mitigation, and propose a nuanced perspective depending on who is experiencing what harm and why.