In the realm of healthcare, where decentralized facilities are prevalent, machine learning faces two major challenges concerning the protection of data and models. The data-level challenge concerns data privacy leakage when centralizing data that contains sensitive personal information, while the model-level challenge arises from the heterogeneity of local models, which must be collaboratively trained while preserving their confidentiality to address intellectual property concerns. To tackle these challenges, we propose a new framework, termed Abstention-Aware Federated Voting (AAFV), that collaboratively and confidentially trains heterogeneous local models while simultaneously protecting data privacy. This is achieved by applying a novel abstention-aware voting mechanism and a differential privacy mechanism to the local models' predictions. In particular, the proposed abstention-aware voting mechanism exploits a threshold-based abstention method to select high-confidence votes from heterogeneous local models, which not only enhances the learning utility but also protects model confidentiality. Furthermore, we implement AAFV on two practical prediction tasks: diabetes and in-hospital patient mortality. The experiments demonstrate the effectiveness and confidentiality of AAFV in terms of testing accuracy and privacy protection.
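The core aggregation step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the threshold `tau`, the per-query Laplace noise on vote counts, and the function name `aafv_aggregate` are all assumptions made for exposition (the framework may calibrate its differential privacy mechanism differently).

```python
import numpy as np

def aafv_aggregate(probabilities, tau=0.8, epsilon=1.0, rng=None):
    """Aggregate heterogeneous local models' predictions for one query
    sample via abstention-aware voting with noisy vote counts.

    probabilities : list of 1-D class-probability vectors, one per model.
    tau : confidence threshold; a model abstains when its top class
          probability falls below tau (hypothetical default).
    epsilon : privacy parameter; Laplace noise of scale 1/epsilon is
              added to the vote histogram (illustrative DP mechanism).
    Returns the voted class index, or None if every model abstained.
    """
    rng = np.random.default_rng(rng)
    num_classes = len(probabilities[0])
    votes = np.zeros(num_classes)
    for p in probabilities:
        p = np.asarray(p, dtype=float)
        if p.max() >= tau:            # high-confidence models cast a vote
            votes[p.argmax()] += 1
        # low-confidence models abstain and contribute nothing,
        # which hides their decision boundaries from the aggregator
    if votes.sum() == 0:
        return None                   # all models abstained on this sample
    # Laplace mechanism on the vote histogram for differential privacy
    noisy = votes + rng.laplace(scale=1.0 / epsilon, size=num_classes)
    return int(noisy.argmax())

# Example: three local models on a binary task (e.g. diabetes yes/no);
# the third model's confidence is below tau, so it abstains.
preds = [[0.90, 0.10], [0.85, 0.15], [0.55, 0.45]]
label = aafv_aggregate(preds, tau=0.8, epsilon=5.0, rng=0)
```

Because only thresholded votes (not model parameters or raw confidences) leave each facility, the scheme supports heterogeneous architectures while limiting what the aggregator learns about any single model.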