Mitigating biases in machine learning models has gained increasing attention in Natural Language Processing (NLP). Yet only a few studies focus on fair text embeddings, which are both crucial and challenging for real-world applications. In this paper, we propose a novel method for learning fair text embeddings. We balance fairness and utility by enforcing conditional independence between sensitive attributes and text embeddings, conditioned on the content. Specifically, we require that the embeddings of texts with different sensitive attributes but identical content lie at the same distance from the embedding of their corresponding neutral text. Furthermore, we address the lack of suitable training data by using Large Language Models (LLMs) to augment texts into different sensitive groups. Our extensive evaluations demonstrate that our approach effectively improves fairness while preserving the utility of the embeddings, representing a pioneering effort in achieving conditional independence for fair text embeddings.
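The equal-distance constraint described above can be illustrated with a minimal sketch. The abstract does not specify the distance metric or the form of the loss, so everything here is an assumption made for illustration: the Euclidean distance, the variance-style penalty, and the hypothetical names `euclidean` and `distance_gap_penalty`. The penalty is zero exactly when all sensitive-group variants of a text are equally far from the neutral text's embedding.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_gap_penalty(variant_embeddings, neutral_embedding):
    """Variance of the distances from each variant embedding to the neutral one.

    Hypothetical penalty: it is 0 when every sensitive-group variant lies at
    the same distance from the neutral embedding, and grows as the distances
    diverge. This is not taken from the paper.
    """
    distances = [euclidean(e, neutral_embedding) for e in variant_embeddings]
    mean_distance = sum(distances) / len(distances)
    return sum((d - mean_distance) ** 2 for d in distances) / len(distances)


# Toy 2-D embeddings (made-up values, for illustration only).
neutral = [0.0, 0.0]
balanced = [[3.0, 4.0], [-3.0, 4.0]]  # both variants are 5 away from neutral
skewed = [[3.0, 4.0], [0.0, 1.0]]     # variants are 5 and 1 away from neutral

balanced_penalty = distance_gap_penalty(balanced, neutral)
skewed_penalty = distance_gap_penalty(skewed, neutral)
print(balanced_penalty, skewed_penalty)
```

In a training setup, a term of this kind would presumably be added to a utility-preserving objective so that the embeddings stay useful while the distance gap shrinks; how the two are weighted is not stated in the abstract.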