Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. The success of any mitigation strategy depends strongly on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift encountered in practice. In this work, we adopt a causal framing to motivate conditional independence tests as a key tool for characterizing distribution shifts. Applying our approach to two medical applications, we show that knowing the structure of a shift can help diagnose failures of fairness transfer, including cases where real-world shifts are more complex than is often assumed in the literature. Based on these results, we discuss potential remedies at each step of the machine learning pipeline.