Feasibility of Federated Learning from Client Databases with Different Brain Diseases and MRI Modalities

Segmentation models for brain lesions in MRI are typically developed for a specific disease and trained on data with a predefined set of MRI modalities. Such models cannot segment the disease using data with a different set of MRI modalities, nor can they segment other types of diseases. Moreover, this training paradigm prevents a model from using the advantages of learning from heterogeneous databases that may contain scans and segmentation labels for different brain pathologies and diverse sets of MRI modalities. Additionally, the confidentiality of patient data often prevents central data aggregation, necessitating a decentralized approach. Is it feasible to use Federated Learning (FL) to train a single model on client databases that contain scans and labels of different brain pathologies and diverse sets of MRI modalities? We demonstrate promising results by combining appropriate, simple, and practical modifications to the model and training strategy: Designing a model with input channels that cover the whole set of modalities available across clients, training with random modality drop, and exploring the effects of feature normalization methods. Evaluation on 7 brain MRI databases with 5 different diseases shows that this FL framework can train a single model achieving very promising results in segmenting all disease types seen during training. Importantly, it can segment these diseases in new databases that contain sets of modalities different from those in training clients. These results demonstrate, for the first time, the feasibility and effectiveness of using FL to train a single 3D segmentation model on decentralised data with diverse brain diseases and MRI modalities, a necessary step towards leveraging heterogeneous real-world databases. Code: https://github.com/FelixWag/FedUniBrain
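The two input-side modifications named in the abstract, input channels covering the union of modalities and random modality drop, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The modality names, the zero-filling of missing channels, the drop probability `p_drop`, and both helper functions are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical union of modalities across all clients (names are assumptions).
ALL_MODALITIES = ["T1", "T1c", "T2", "FLAIR", "DWI"]


def to_union_channels(scans, all_modalities=ALL_MODALITIES):
    """Place a client's available scans into a fixed channel layout.

    `scans` maps modality name -> 3D array. Channels for modalities the
    client does not have are zero-filled (an assumed convention).
    Returns the stacked volume and a boolean mask of present channels.
    """
    shape = next(iter(scans.values())).shape
    x = np.zeros((len(all_modalities), *shape), dtype=np.float32)
    present = np.zeros(len(all_modalities), dtype=bool)
    for i, name in enumerate(all_modalities):
        if name in scans:
            x[i] = scans[name]
            present[i] = True
    return x, present


def random_modality_drop(x, present, rng, p_drop=0.5):
    """Zero out a random subset of the available channels.

    At least one available channel is always kept so the model never
    receives an all-zero input. Returns the masked copy and the mask of
    channels that were kept.
    """
    available = np.flatnonzero(present)
    keep = rng.random(available.size) >= p_drop
    if not keep.any():
        # Guarantee that one randomly chosen available modality survives.
        keep[rng.integers(available.size)] = True
    out = x.copy()
    out[available[~keep]] = 0.0
    kept = np.zeros_like(present)
    kept[available[keep]] = True
    return out, kept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A hypothetical client that only has T1 and FLAIR scans.
    client_scans = {
        "T1": np.ones((4, 4, 4), dtype=np.float32),
        "FLAIR": np.full((4, 4, 4), 2.0, dtype=np.float32),
    }
    x, present = to_union_channels(client_scans)
    x_aug, kept = random_modality_drop(x, present, rng)
    print(x.shape, present, kept)
```

Under these assumptions, every client feeds the same channel layout to the shared model regardless of which modalities it holds, and dropping channels during training exposes the model to modality subsets it may encounter in unseen databases.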

