We introduce a method, MMD-B-Fair, to learn fair representations of data via kernel two-sample testing. We find neural features of our data for which a maximum mean discrepancy (MMD) test cannot distinguish between representations of different sensitive groups, while preserving information about the target attributes. Minimizing the power of an MMD test is more difficult than maximizing it (as done in previous work), because the complex behavior of the test threshold cannot simply be ignored. Our method exploits the simple asymptotics of block testing schemes to efficiently find fair representations, without requiring the complex adversarial optimization or generative modelling schemes widely used in existing work on fair representation learning. We evaluate our approach on various datasets, showing its ability to ``hide'' information about sensitive attributes, and its effectiveness in downstream transfer tasks.
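To make the "simple asymptotics of block testing schemes" concrete, the following is a minimal sketch of a generic block MMD two-sample test, not the authors' implementation. It splits the samples into blocks, computes an unbiased MMD estimate on each block, and averages them; under the usual assumptions the average is approximately normal, so the rejection threshold is a plain normal quantile. The Gaussian kernel, the bandwidth, the block size, and all function names are assumptions chosen for illustration. The sketch covers only the test itself; MMD-B-Fair additionally learns features that reduce the power of such a test, which is not shown here.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased (U-statistic) estimate of MMD^2 for two equal-sized samples."""
    m = len(xs)
    if m < 2 or len(ys) != m:
        raise ValueError("need two samples of equal size >= 2")
    total = 0.0
    for i in range(m):
        for j in range(m):
            if i != j:
                total += (gaussian_kernel(xs[i], xs[j], bandwidth)
                          + gaussian_kernel(ys[i], ys[j], bandwidth)
                          - gaussian_kernel(xs[i], ys[j], bandwidth)
                          - gaussian_kernel(ys[i], xs[j], bandwidth))
    return total / (m * (m - 1))


def block_mmd_test(xs, ys, block_size, alpha=0.05, bandwidth=1.0):
    """One-sided block MMD test; returns (z_score, threshold, reject)."""
    n_blocks = min(len(xs), len(ys)) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")
    # One unbiased MMD^2 estimate per disjoint block.
    block_stats = [
        mmd2_unbiased(xs[b * block_size:(b + 1) * block_size],
                      ys[b * block_size:(b + 1) * block_size],
                      bandwidth)
        for b in range(n_blocks)
    ]
    # The block average is approximately normal, so standardize it
    # with the empirical standard error across blocks.
    std_err = stdev(block_stats) / math.sqrt(n_blocks)
    if std_err == 0.0:
        raise ValueError("block statistics have zero variance")
    z_score = mean(block_stats) / std_err
    # Threshold is the (1 - alpha) quantile of the standard normal.
    threshold = NormalDist().inv_cdf(1.0 - alpha)
    return z_score, threshold, z_score > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 1-D "representations" of two sensitive groups.
    group_a = [[rng.gauss(0.0, 1.0)] for _ in range(200)]
    group_b = [[rng.gauss(3.0, 1.0)] for _ in range(200)]
    print(block_mmd_test(group_a, group_b, block_size=10))
```

Because the threshold depends only on `alpha` and not on the data or the kernel, the approximate power of this test can be written in closed form in terms of the block mean and variance. That is the property that makes a block test a convenient quantity to optimize, in contrast to tests whose threshold must be estimated by permutation.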