One of the biggest challenges in building artificial intelligence (AI) models for healthcare is data sharing. Because healthcare data is private, sensitive, and heterogeneous, collecting sufficient data for modelling is exhausting, costly, and sometimes impossible. In this paper, we propose a framework for global healthcare modelling that uses datasets from multiple continents (Europe, North America, and Asia) without sharing the local datasets, and we choose glucose management as a case study to verify its effectiveness. Technically, we implement blockchain-enabled federated learning, adapted to meet the privacy and safety requirements of healthcare data; its on-chain incentive mechanism rewards honest participation and penalizes malicious activity. Experimental results show that the proposed framework is effective, efficient, and privacy-preserving. Its prediction accuracy consistently outperforms that of models trained on limited personal data and is comparable to, or in certain scenarios slightly better than, that of centralized training, all while preserving data privacy. This work paves the way for international collaboration on healthcare projects, where additional data is crucial for reducing bias and providing benefits to humanity.
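To make the idea of federated training with an incentive record more concrete, the following is a minimal sketch of a single aggregation round. It is an illustration under stated assumptions, not the framework described in the abstract: the `Ledger` class is a hypothetical in-memory stand-in for an on-chain record, the norm threshold used to flag suspicious updates is an assumed placeholder for whatever validation the actual system performs, and the site names, reward, and penalty values are invented for the example. Only model weights leave each site; the local datasets are never shared.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy in-memory stand-in for an on-chain incentive record (hypothetical)."""
    balances: dict = field(default_factory=dict)

    def adjust(self, site: str, amount: float) -> None:
        # Add a reward (positive) or penalty (negative) to a site's balance.
        self.balances[site] = self.balances.get(site, 0.0) + amount


def federated_round(updates, ledger, max_norm=10.0, reward=1.0, penalty=5.0):
    """Aggregate one round of local updates by sample-weighted averaging.

    updates maps a site name to (weights, number of local samples).
    Assumption for illustration: an update whose L2 norm exceeds max_norm
    is treated as suspicious, penalized, and excluded from the average.
    """
    accepted = {}
    for site, (weights, n_samples) in updates.items():
        norm = sum(w * w for w in weights) ** 0.5
        if norm > max_norm:
            ledger.adjust(site, -penalty)
        else:
            ledger.adjust(site, reward)
            accepted[site] = (weights, n_samples)

    if not accepted:
        raise ValueError("no acceptable updates this round")

    total = sum(n for _, n in accepted.values())
    dim = len(next(iter(accepted.values()))[0])
    # Federated averaging: each coordinate is weighted by local sample count.
    return [
        sum(w[i] * n for w, n in accepted.values()) / total
        for i in range(dim)
    ]


ledger = Ledger()
updates = {
    "europe": ([1.0, 2.0], 100),
    "north_america": ([3.0, 4.0], 300),
    "asia": ([1000.0, -1000.0], 50),  # implausibly large update, flagged
}
global_weights = federated_round(updates, ledger)
```

In this toy run the flagged site is excluded from the average and its balance decreases, while the two accepted sites are rewarded. A real deployment would replace the norm check with a validation scheme suited to the model and record balances through a smart contract rather than a Python dictionary.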