Fairness in machine learning is important for societal well-being, but the scarcity of public datasets hinders progress. Currently, no dedicated public medical dataset with imaging data is available for fairness learning, even though minority groups suffer from more health issues. To address this gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal nerve disease dataset with both 2D and 3D imaging data and balanced racial groups for glaucoma detection. Glaucoma is the leading cause of irreversible blindness globally, and its prevalence among Black individuals is twice that of other races. We also propose a fair identity normalization (FIN) approach to equalize feature importance across identity groups. We compare FIN with various state-of-the-art fairness learning methods and find that it achieves superior performance on both racial and gender fairness tasks with 2D and 3D imaging data, which demonstrates the utility of Harvard-GF for fairness learning. To facilitate fairness comparisons between models, we propose an equity-scaled performance measure that can be flexibly applied to any performance metric in the context of fairness. The dataset and code are publicly accessible via https://doi.org/10.7910/DVN/A4XMO1 and https://github.com/luoyan407/Harvard-GF, respectively.
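The abstract does not give the formula for the equity-scaled performance measure. As a minimal sketch, assume it divides the overall value of a metric by one plus the sum of absolute gaps between the overall value and each identity group's value, so that larger disparities between groups shrink the score. The function names `accuracy` and `equity_scaled`, and the choice of accuracy as the base metric, are hypothetical and used only for illustration; the released code may define the measure differently.

```python
from typing import Callable, Hashable, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels (illustrative base metric)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def equity_scaled(
    metric: Callable[[Sequence[int], Sequence[int]], float],
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[Hashable],
) -> float:
    """Assumed form: overall / (1 + sum over groups of |overall - group score|)."""
    overall = metric(y_true, y_pred)
    gap = 0.0
    for g in set(groups):
        # Indices of the samples that belong to identity group g.
        idx = [i for i, a in enumerate(groups) if a == g]
        group_score = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
        gap += abs(overall - group_score)
    # With no disparity between groups, gap is 0 and the overall metric is returned unchanged.
    return overall / (1.0 + gap)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    preds = [1, 0, 1, 1]
    identity = ["a", "a", "b", "b"]
    print(equity_scaled(accuracy, labels, preds, identity))
```

Under this assumed form, any metric with the same call signature (for example an AUC function) can be passed in place of `accuracy`, which is one way to read the claim that the measure applies to all kinds of performance metrics.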