We investigate the prominent class of fair representation learning methods for bias mitigation. Using causal reasoning to define and formalise different sources of dataset bias, we reveal important implicit assumptions inherent to these methods. We prove fundamental limitations on fair representation learning when evaluation data is drawn from the same distribution as training data and run experiments across a range of medical modalities to examine the performance of fair representation learning under distribution shifts. Our results explain apparent contradictions in the existing literature and reveal how rarely considered causal and statistical aspects of the underlying data affect the validity of fair representation learning. We raise doubts about current evaluation practices and the applicability of fair representation learning methods in performance-sensitive settings. We argue that fine-grained analysis of dataset biases should play a key role in the field moving forward.