Fairness in artificial intelligence models has gained significantly more attention in recent years, especially in medicine, where model fairness is critical to people's well-being and lives. High-quality medical fairness datasets are needed to advance fairness learning research. Existing medical fairness datasets all target classification tasks, and none are available for medical segmentation, even though segmentation is as important a clinical task as classification and provides detailed spatial information on organ abnormalities that clinicians can assess directly. In this paper, we propose the first fairness dataset for medical segmentation, named Harvard-FairSeg, with 10,000 subject samples. In addition, we propose a fair error-bound scaling approach that reweights the loss function with the upper error bound in each identity group, using the Segment Anything Model (SAM). We anticipate that segmentation performance equity can be improved by explicitly tackling the hard cases with high training errors in each identity group. To facilitate fair comparisons, we use a novel equity-scaled segmentation performance metric, such as the equity-scaled Dice coefficient, to compare segmentation metrics in the context of fairness. Through comprehensive experiments, we demonstrate that our fair error-bound scaling approach achieves fairness performance superior or comparable to that of state-of-the-art fairness learning models. The dataset and code are publicly accessible via https://ophai.hms.harvard.edu/datasets/harvard-fairseg10k.
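The abstract does not spell out how an equity-scaled metric is computed. As a rough illustration only, the sketch below assumes one plausible form: the overall mean score divided by one plus the summed absolute gaps between the overall mean and each identity group's mean, so that larger group disparities shrink the reported score. The function names `dice` and `equity_scaled`, and this exact formula, are assumptions made for illustration and are not taken from the released code.

```python
from collections import defaultdict


def dice(pred, target):
    """Dice coefficient for two flat binary masks given as sequences of 0/1."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def equity_scaled(scores, groups):
    """Assumed form: overall mean / (1 + sum of |overall mean - group mean|)."""
    overall = sum(scores) / len(scores)
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    # Total absolute deviation of each group's mean from the overall mean.
    deviation = sum(abs(overall - sum(v) / len(v)) for v in by_group.values())
    return overall / (1.0 + deviation)


if __name__ == "__main__":
    per_sample = [0.9, 0.9, 0.7, 0.7]       # hypothetical per-sample Dice scores
    identity = ["a", "a", "b", "b"]         # hypothetical identity group labels
    print(equity_scaled(per_sample, identity))
```

Under this assumed definition, a model whose groups all have the same mean score keeps its plain mean Dice, while any gap between groups lowers the equity-scaled value.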