Medical imaging models have been shown to encode information about patient demographics such as age, race, and sex in their latent representation, raising concerns about their potential for discrimination. Here, we ask whether requiring models not to encode demographic attributes is desirable. We point out that marginal and class-conditional representation invariance imply the standard group fairness notions of demographic parity and equalized odds, respectively, while additionally requiring risk distribution matching, thus potentially equalizing away important group differences. Enforcing the traditional fairness notions directly instead does not entail these strong constraints. Moreover, representationally invariant models may still take demographic attributes into account for deriving predictions. The latter can be prevented using counterfactual notions of (individual) fairness or invariance. We caution, however, that properly defining medical image counterfactuals with respect to demographic attributes is highly challenging. Finally, we posit that encoding demographic attributes may even be advantageous if it enables learning a task-specific encoding of demographic features that does not rely on social constructs such as 'race' and 'gender.' We conclude that demographically invariant representations are neither necessary nor sufficient for fairness in medical imaging. Models may need to encode demographic attributes, lending further urgency to calls for comprehensive model fairness assessments in terms of predictive performance across diverse patient groups.
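The implication claimed above can be sketched formally. The notation below is an assumption introduced here for illustration and does not come from the abstract: $X$ is the image, $A$ the demographic attribute, $Y$ the label, $Z = f(X)$ the latent representation, $R = h(Z)$ a risk score, and $\hat{Y} = \mathbb{1}[R > \tau]$ the thresholded prediction. The sketch relies only on the fact that a function of a variable independent of $A$ is itself independent of $A$.

```latex
% Assumed notation (not from the source): Z = f(X), R = h(Z), \hat{Y} = 1[R > \tau].
\begin{align*}
% Marginal invariance: the representation is independent of the attribute.
Z \perp A
  &\;\Longrightarrow\; R \perp A
   \;\Longrightarrow\; \hat{Y} \perp A \\
% Consequence 1: demographic parity of the thresholded prediction.
  &\;\Longrightarrow\; P(\hat{Y}=1 \mid A=a) = P(\hat{Y}=1 \mid A=a')
   \quad \text{for all } a, a' \\
% Consequence 2 (stronger): the whole risk distribution is matched.
  &\;\Longrightarrow\; P(R \le r \mid A=a) = P(R \le r \mid A=a')
   \quad \text{for all } r, a, a' \\[6pt]
% Class-conditional invariance: independence holds within each label class.
Z \perp A \mid Y
  &\;\Longrightarrow\; P(\hat{Y}=1 \mid Y=y, A=a) = P(\hat{Y}=1 \mid Y=y, A=a')
   \quad \text{for all } y, a, a' \\
% Again the class-conditional risk distribution is matched as well.
  &\;\Longrightarrow\; P(R \le r \mid Y=y, A=a) = P(R \le r \mid Y=y, A=a')
   \quad \text{for all } r, y, a, a'
\end{align*}
```

The converse does not hold in general: demographic parity and equalized odds constrain only $\hat{Y}$ at the chosen threshold $\tau$, so a model can satisfy them while its risk distributions still differ between groups. This is the sense in which enforcing the group fairness notions directly is a weaker requirement than representation invariance.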