Mitigating bias in machine learning models has gained increasing attention in Natural Language Processing (NLP). Yet only a few studies focus on fair text embeddings, which are crucial yet challenging for real-world applications. In this paper, we propose a novel method for learning fair text embeddings. We balance fairness against utility by enforcing conditional independence between sensitive attributes and text embeddings, conditioned on the content. Specifically, we require that the embeddings of texts with different sensitive attributes but identical content lie at the same distance from the embedding of their corresponding neutral text. Furthermore, we address the lack of suitable training data by using Large Language Models (LLMs) to augment texts into different sensitive groups. Our extensive evaluations demonstrate that our approach effectively improves fairness while preserving the utility of embeddings, representing a pioneering effort in achieving conditional independence for fair text embeddings.
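To make the equal-distance condition concrete, the following is a minimal sketch of one way such a constraint could be scored. It is not the loss used in this paper, whose exact form is not given in the abstract. The function name `equal_distance_penalty`, the choice of Euclidean distance, and the variance-style penalty are all assumptions made for illustration.

```python
import numpy as np


def equal_distance_penalty(group_embs, neutral_emb):
    """Hypothetical penalty for the equal-distance condition.

    group_embs:  array of shape (G, D), embeddings of the same content
                 rewritten for G sensitive groups (assumed input layout).
    neutral_emb: array of shape (D,), embedding of the neutral text.

    Returns the mean squared deviation of the G distances from their mean.
    The value is 0 exactly when every group embedding is equally far from
    the neutral embedding.
    """
    group_embs = np.asarray(group_embs, dtype=float)
    neutral_emb = np.asarray(neutral_emb, dtype=float)
    # Euclidean distance from each group embedding to the neutral embedding
    # (assumed metric; cosine distance would be an equally plausible choice).
    dists = np.linalg.norm(group_embs - neutral_emb, axis=1)
    return float(np.mean((dists - dists.mean()) ** 2))


# Two group variants equally far from the neutral text: no penalty.
balanced = equal_distance_penalty([[1.0, 0.0], [-1.0, 0.0]], [0.0, 0.0])
# One variant is three times farther away than the other: positive penalty.
skewed = equal_distance_penalty([[1.0, 0.0], [3.0, 0.0]], [0.0, 0.0])
```

In a training setting, a term of this kind would presumably be added to a utility-preserving objective, with the group variants produced by the LLM-based augmentation described above.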