We introduce a method, MMD-B-Fair, to learn fair representations of data via kernel two-sample testing. We find neural features of our data where a maximum mean discrepancy (MMD) test cannot distinguish between representations of different sensitive groups, while preserving information about the target attributes. Minimizing the power of an MMD test is more difficult than maximizing it (as done in previous work), because the test threshold's complex behavior cannot be simply ignored. Our method exploits the simple asymptotics of block testing schemes to efficiently find fair representations without requiring complex adversarial optimization or generative modelling schemes widely used by existing work on fair representation learning. We evaluate our approach on various datasets, showing its ability to ``hide'' information about sensitive attributes, and its effectiveness in downstream transfer tasks.
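To make the block testing idea concrete, the following is a minimal sketch of a block MMD two-sample test, not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth and equal-sized samples; the function names (`gaussian_kernel`, `mmd2_unbiased`, `block_mmd_test`) and all parameter values are hypothetical choices for illustration. The samples are split into blocks, an unbiased squared-MMD estimate is computed on each block, and the block average is compared against a normal quantile, which is the kind of simple asymptotic threshold the abstract refers to.

```python
import numpy as np
from statistics import NormalDist


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of squared MMD between equal-sized samples x and y."""
    n = x.shape[0]
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) to stay unbiased.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()


def block_mmd_test(x, y, block_size, alpha=0.05, bandwidth=1.0):
    """Block MMD test: average per-block estimates, use a normal threshold.

    Returns (z_statistic, threshold, reject).
    """
    n = min(len(x), len(y))
    n_blocks = n // block_size
    if n_blocks < 2:
        raise ValueError("need at least two blocks to estimate the variance")

    block_stats = np.array([
        mmd2_unbiased(
            x[i * block_size:(i + 1) * block_size],
            y[i * block_size:(i + 1) * block_size],
            bandwidth,
        )
        for i in range(n_blocks)
    ])

    # The block average is approximately normal by the central limit theorem,
    # with mean zero when both samples come from the same distribution.
    std_err = block_stats.std(ddof=1) / np.sqrt(n_blocks)
    z = block_stats.mean() / std_err
    threshold = NormalDist().inv_cdf(1.0 - alpha)
    return float(z), threshold, bool(z > threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    group_a = rng.normal(0.0, 1.0, size=(400, 2))
    group_b = rng.normal(3.0, 1.0, size=(400, 2))
    z, threshold, reject = block_mmd_test(group_a, group_b, block_size=20)
    print(f"z = {z:.2f}, threshold = {threshold:.2f}, reject = {reject}")
```

In the setting of the abstract, `x` and `y` would be learned representations of two sensitive groups, and a fair representation is one for which such a test fails to reject. The sketch covers only the test itself; how the power of the test is estimated and minimized during training is not shown here.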