Collaborative filtering based recommendation learns users' preferences from the historical behavior data of all users and is widely used to facilitate decision making. Recently, the fairness of recommendation has become increasingly important. A recommender system is considered unfair when it does not perform equally well for different user groups defined by users' sensitive attributes (e.g., gender, race). Many methods have been proposed to alleviate unfairness by optimizing a predefined fairness goal or by changing the distribution of unbalanced training data. However, they are either tied to specific fairness optimization metrics or rely on redesigning the current recommendation architecture. In this paper, we study how to improve recommendation fairness from the data augmentation perspective. Because the recommendation model amplifies the inherent unfairness of imbalanced training data, we augment the imbalanced training data towards a balanced data distribution to improve fairness. The proposed framework is generally applicable to any embedding-based recommendation model and does not need a predefined fairness metric. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed framework. We publish the source code at https://github.com/newlei/FDA.
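To make the notion of unfairness above concrete, the following is a minimal sketch of how a performance gap between user groups could be measured. It is not the metric or the method of this paper: the function name `group_performance_gap`, the per-user scores, and the group labels are all hypothetical, and the per-user score is assumed to be any recommendation quality measure (for example, a hit rate).

```python
from collections import defaultdict


def group_performance_gap(scores, groups):
    """Return the mean score per group and the largest gap between groups.

    scores: per-user recommendation quality values (hypothetical).
    groups: sensitive-attribute label of each user, aligned with scores.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for score, group in zip(scores, groups):
        totals[group] += score
        counts[group] += 1
    # Average quality inside each user group.
    means = {g: totals[g] / counts[g] for g in totals}
    # A gap of 0 means every group is served equally well on average.
    gap = max(means.values()) - min(means.values())
    return means, gap


# Hypothetical per-user scores for two user groups "a" and "b".
scores = [0.5, 0.25, 0.75, 0.5]
groups = ["a", "a", "b", "b"]
means, gap = group_performance_gap(scores, groups)
print(means, gap)
```

Under this illustrative measure, a larger gap indicates that the recommender serves one group noticeably better than another, which is the situation the abstract describes as unfair.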