Mainstream Natural Language Processing (NLP) research has ignored the majority of the world's languages. In moving from excluding these languages to blindly adopting what we build for English, we risk importing the same harms that we have at best mitigated, and at least measured, for English. Yet in evaluating and mitigating the harms that arise from adopting new technologies in such contexts, we often disregard (1) the communities' actual needs for Language Technologies, and (2) bias and fairness issues within the context of those communities. In this extended abstract, we consider fairness, bias, and inclusion in Language Technologies through the lens of the Capabilities Approach. The Capabilities Approach centers on what people are capable of achieving given their intersectional social, political, and economic contexts, rather than on what resources are (theoretically) available to them. We detail the Capabilities Approach, its relationship to multilingual and multicultural evaluation, and how the framework affords meaningful collaboration with community members in defining and measuring the harms of Language Technologies.