Natural Language Processing (NLP) plays an important role in our daily lives, particularly due to the enormous progress of Large Language Models (LLMs). However, NLP has many fairness-critical use cases, e.g., as an expert system in recruitment or as an LLM-based tutor in education. Since NLP is based on human language, potentially harmful biases can diffuse into NLP systems and produce unfair results, discriminate against minorities, or raise legal issues. Hence, it is important to develop a fairness certification for NLP approaches. We follow a qualitative research approach towards a fairness certification for NLP. In particular, we have reviewed a large body of literature on algorithmic fairness, and we have conducted semi-structured interviews with a wide range of experts from that area. We have systematically devised six fairness criteria for NLP, which can be further refined into 18 sub-categories. Our criteria offer a foundation for operationalizing and testing processes to certify fairness, both from the perspective of the auditor and the audited organization.