A fair classifier should ensure that people from different groups benefit, yet group information is often sensitive and unsuitable for model training. Learning a fair classifier without using sensitive attributes in the training dataset is therefore important. In this paper, we study learning fair classifiers without implementing fair training algorithms, so as to avoid possible leakage of sensitive information. Our theoretical analyses validate the feasibility of this approach: traditional training on a dataset with an appropriate distribution shift can reduce both the upper bound on fairness disparity and the model generalization error, indicating that fairness and accuracy can be improved simultaneously by traditional training alone. We then propose a tractable solution that progressively shifts the original training data during training by sampling influential data, where the sensitive attributes of the new data are neither accessed during sampling nor used in training. Extensive experiments on real-world data demonstrate the effectiveness of the proposed algorithm.
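To make the core mechanism concrete, the following is a minimal sketch of progressively shifting the training distribution by sampling influential examples, without ever reading sensitive attributes. The logistic-regression setup, the first-order gradient-alignment influence score, and all function names and hyperparameters here are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Sketch (assumed setup): a model is trained with ordinary gradient descent,
# and between rounds the most "influential" examples from an unlabeled-by-group
# pool are added to the training set. Influence is approximated to first order
# by the alignment between a candidate's loss gradient and the gradient of the
# loss on a small reference set; sensitive attributes are never accessed.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_logistic(w, X, y):
    """Mean gradient of the logistic loss for parameters w on (X, y)."""
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def influence_scores(w, X_cand, y_cand, X_ref, y_ref):
    """First-order influence: dot product between each candidate's per-example
    gradient and the reference-set gradient. A higher score means a descent
    step on that candidate also reduces the reference loss."""
    g_ref = grad_logistic(w, X_ref, y_ref)
    p = sigmoid(X_cand @ w)
    g_cand = X_cand * (p - y_cand)[:, None]   # per-example gradients
    return g_cand @ g_ref

def progressively_shift(X_train, y_train, X_pool, y_pool, X_ref, y_ref,
                        rounds=5, per_round=50, lr=0.1, epochs=200):
    """Alternate ordinary (fairness-agnostic) training with adding the most
    influential pool examples to the training set (a sketch, not the paper's
    exact sampling rule)."""
    w = np.zeros(X_train.shape[1])
    for _ in range(rounds):
        # ordinary gradient descent on the current training data
        for _ in range(epochs):
            w -= lr * grad_logistic(w, X_train, y_train)
        if len(X_pool) == 0:
            break
        # score remaining pool examples and move the top ones into training
        scores = influence_scores(w, X_pool, y_pool, X_ref, y_ref)
        top = np.argsort(-scores)[:per_round]
        X_train = np.vstack([X_train, X_pool[top]])
        y_train = np.concatenate([y_train, y_pool[top]])
        keep = np.setdiff1d(np.arange(len(X_pool)), top)
        X_pool, y_pool = X_pool[keep], y_pool[keep]
    return w
```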