Fairness in artificial intelligence models has gained significantly more attention in recent years, especially in medicine, where model fairness is critical to people's well-being and lives. High-quality medical fairness datasets are needed to advance fairness learning research. Existing medical fairness datasets all target classification tasks, and none are available for medical segmentation, even though segmentation is a clinical task as important as classification: it provides detailed spatial information on organ abnormalities that clinicians can assess directly. In this paper, we propose the first fairness dataset for medical segmentation, named Harvard-FairSeg, with 10,000 subject samples. In addition, we propose a fair error-bound scaling approach that reweights the loss function with the upper error-bound in each identity group, using the Segment Anything Model (SAM). We anticipate that segmentation performance equity can be improved by explicitly tackling the hard cases with high training errors in each identity group. To facilitate fair comparisons, we use a novel equity-scaled segmentation performance metric, such as the equity-scaled Dice coefficient, to compare segmentation metrics in the context of fairness. Through comprehensive experiments, we demonstrate that our fair error-bound scaling approach has either superior or comparable fairness performance relative to state-of-the-art fairness learning models. The dataset and code are publicly accessible via https://ophai.hms.harvard.edu/harvard-fairseg10k.
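To make the idea of an equity-scaled segmentation metric concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the formula, so the sketch assumes one plausible form: the overall score divided by one plus the summed absolute deviation of each identity group's mean score from the overall score. The function names `dice` and `equity_scaled_score` are hypothetical.

```python
import numpy as np


def dice(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def equity_scaled_score(scores, groups):
    """Assumed form of an equity-scaled metric (not taken from the text).

    scores: per-sample segmentation scores, e.g. Dice values.
    groups: per-sample identity-group labels, same length as scores.
    Returns overall / (1 + sum over groups of |overall - group mean|).
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    overall = scores.mean()
    # Penalty grows as group-level performance drifts from the overall score.
    deviation = sum(
        abs(overall - scores[groups == g].mean()) for g in np.unique(groups)
    )
    return overall / (1.0 + deviation)


if __name__ == "__main__":
    sample_scores = [0.8, 0.8, 0.6, 0.6]
    sample_groups = [0, 0, 1, 1]
    print(equity_scaled_score(sample_scores, sample_groups))
```

Under this assumed form, a model whose groups all score the same keeps its overall score unchanged, while any disparity between groups lowers the equity-scaled value, so a single number reflects both accuracy and equity.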