Mitigating bias in machine learning models is a growing concern in Natural Language Processing (NLP), particularly in developing fair text embeddings, which are crucial yet challenging for real-world applications such as search engines. In response, this paper proposes a novel method for learning fair text embeddings. First, we define a novel fairness notion for text embeddings, content-conditional equal distance (CCED), which ensures content-conditional independence between sensitive attributes and text embeddings. Building on CCED, we introduce a content-conditional debiasing (CCD) loss, which ensures that embeddings of texts with different sensitive attributes but identical content maintain the same distance from the embedding of their corresponding neutral text. Additionally, we tackle the issue of insufficient training data by using Large Language Models (LLMs) with instructions to fairly augment texts into different sensitive groups. Our extensive evaluations show that our approach effectively enhances fairness while maintaining the utility of the embeddings. Furthermore, our augmented dataset, combined with the CCED metric, serves as a new benchmark for evaluating fairness.
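The equal-distance idea behind the CCD loss can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the Euclidean distance, the squared-difference penalty averaged over group pairs, and the hypothetical names `euclidean` and `equal_distance_penalty`. It is not the paper's implementation, only one way to measure how far a set of same-content embeddings is from being equidistant to the neutral embedding.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def equal_distance_penalty(group_embeddings, neutral_embedding):
    """Hypothetical penalty: mean squared gap between group-to-neutral distances.

    group_embeddings: dict mapping a sensitive group name to the embedding of
        the same content rewritten for that group.
    neutral_embedding: embedding of the corresponding neutral text.
    Returns 0.0 when every group embedding is equally far from the neutral one.
    """
    distances = [euclidean(e, neutral_embedding) for e in group_embeddings.values()]
    pairs = list(combinations(distances, 2))
    if not pairs:  # fewer than two groups: nothing to compare
        return 0.0
    return sum((d1 - d2) ** 2 for d1, d2 in pairs) / len(pairs)


# Toy 2-D embeddings, invented for illustration only.
neutral = [0.0, 0.0]
balanced = {"group_a": [3.0, 4.0], "group_b": [4.0, 3.0]}  # both at distance 5
skewed = {"group_a": [3.0, 4.0], "group_b": [6.0, 8.0]}    # distances 5 and 10

print(equal_distance_penalty(balanced, neutral))  # → 0.0
print(equal_distance_penalty(skewed, neutral))    # → 25.0
```

In this sketch the penalty is zero only when all sensitive-group variants of a text sit at the same distance from the neutral text, which is the condition the abstract describes; a training setup would presumably combine such a term with a utility-preserving objective.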