Private, fair and accurate: Training large-scale, privacy-preserving AI models in radiology

Artificial intelligence (AI) models are increasingly used in the medical domain. However, as medical data is highly sensitive, special precautions are required to protect it. The gold standard for privacy preservation is the introduction of differential privacy (DP) to model training. However, prior work has shown that DP has negative implications for model accuracy and fairness. Therefore, the purpose of this study is to demonstrate that the privacy-preserving training of AI models for chest radiograph diagnosis is possible with high accuracy and fairness compared to non-private training. A total of N=193,311 high-quality clinical chest radiographs were retrospectively collected and manually labeled by experienced radiologists, who assigned one or more of the following diagnoses to each side (where applicable): cardiomegaly, congestion, pleural effusion, pneumonic infiltration and atelectasis. The non-private AI models were compared with privacy-preserving (DP) models with respect to privacy-utility trade-offs, measured as area under the receiver operating characteristic curve (AUROC), and privacy-fairness trade-offs, measured as Pearson's r or Statistical Parity Difference. The non-private AI model achieved an average AUROC of 0.90 over all labels, whereas the DP AI model with a privacy budget of epsilon=7.89 achieved an AUROC of 0.87, i.e., a mere 2.6% performance decrease compared to non-private training. The privacy-preserving training of diagnostic AI models can achieve high performance with a small penalty on model accuracy and does not amplify discrimination based on age, sex or co-morbidity. We thus encourage practitioners to integrate state-of-the-art privacy-preserving techniques into medical AI model development.
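
As a rough illustration of one of the fairness measures named in the abstract, the sketch below computes a Statistical Parity Difference for binary predictions. It assumes the common definition, namely the positive-prediction rate in one subgroup minus the rate in another; the abstract does not spell out the exact formulation the authors used, so the sign convention, the function name `statistical_parity_difference` and the toy data are all assumptions made for this example only.

```python
from typing import Sequence


def statistical_parity_difference(
    predictions: Sequence[int], group: Sequence[int]
) -> float:
    """Positive-prediction rate of group 1 minus that of group 0.

    Assumed convention: predictions are 0/1 labels and `group` holds a
    binary subgroup indicator (e.g. a hypothetical 1 = female, 0 = male).
    A value of 0.0 means both subgroups receive positive predictions at
    the same rate.
    """
    if len(predictions) != len(group):
        raise ValueError("predictions and group must have the same length")

    preds_g1 = [p for p, g in zip(predictions, group) if g == 1]
    preds_g0 = [p for p, g in zip(predictions, group) if g == 0]
    if not preds_g1 or not preds_g0:
        raise ValueError("both subgroups must contain at least one sample")

    rate_g1 = sum(preds_g1) / len(preds_g1)
    rate_g0 = sum(preds_g0) / len(preds_g0)
    return rate_g1 - rate_g0


# Toy data, invented for illustration (not taken from the study).
toy_predictions = [1, 0, 1, 1, 0, 1, 0, 0]
toy_group = [1, 1, 1, 1, 0, 0, 0, 0]
spd = statistical_parity_difference(toy_predictions, toy_group)
print(f"Statistical Parity Difference: {spd:.2f}")
```

Under this convention, a privacy-fairness comparison would compute the value once for the non-private model and once for the DP model on the same subgroup split, then check whether the gap widens.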

