Federated Learning (FL) trains a machine learning (ML) model in a distributed fashion, strengthening data privacy while limiting data migration costs. This makes it a natural fit for privacy-sensitive medical imaging datasets. However, most current FL-based medical imaging work assumes that every silo has ground-truth labels for training. In practice, label acquisition in the medical field is challenging because it often requires extensive labor and time. To address this challenge and leverage unannotated data silos to improve modeling, we propose Federated Alternate Training (FAT), a framework that alternates training between annotated and unannotated data silos. Annotated data silos exploit their annotations to learn a reasonable global segmentation model. Unannotated data silos, in turn, use the global segmentation model as a target model to generate pseudo labels for self-supervised learning. We evaluate the proposed framework on two naturally partitioned federated datasets, KiTS19 and FeTS2021, and show promising performance.
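
The alternation described above can be illustrated with a toy sketch. It is not the authors' implementation: the per-sample logistic-regression "segmenter", the strict even/odd round schedule, the size-weighted averaging (FedAvg-style), the 0.5 pseudo-label threshold, and all function names (`local_train`, `fed_avg`, `pseudo_label`, `alternate_training`) are assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(w, X, y, lr=0.5, steps=20):
    """Run a few logistic-regression gradient steps, starting from the global weights."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(weights, sizes):
    """Size-weighted average of local weights (assumed aggregation rule)."""
    return np.average(np.stack(weights), axis=0, weights=np.asarray(sizes, dtype=float))

def pseudo_label(w_target, X, threshold=0.5):
    """Hard pseudo labels from the (frozen) global model acting as the target model."""
    return (sigmoid(X @ w_target) >= threshold).astype(float)

def alternate_training(annotated, unannotated, dim, rounds=6):
    """Alternate federated rounds between annotated and unannotated silos."""
    w_global = np.zeros(dim)
    schedule = []
    for r in range(rounds):
        if r % 2 == 0:
            # Annotated round: silos train on their ground-truth labels.
            schedule.append("annotated")
            local = [local_train(w_global, X, y) for X, y in annotated]
            sizes = [len(y) for _, y in annotated]
        else:
            # Unannotated round: silos train on pseudo labels from the global model.
            schedule.append("unannotated")
            local = [local_train(w_global, X, pseudo_label(w_global, X))
                     for X in unannotated]
            sizes = [len(X) for X in unannotated]
        w_global = fed_avg(local, sizes)
    return w_global, schedule

# Synthetic stand-in data (hypothetical; not KiTS19 or FeTS2021).
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])

def make_silo(n, labeled):
    X = rng.normal(size=(n, 3))
    y = (X @ true_w > 0).astype(float)
    return (X, y) if labeled else X

annotated = [make_silo(200, True) for _ in range(2)]
unannotated = [make_silo(200, False) for _ in range(3)]
w, schedule = alternate_training(annotated, unannotated, dim=3)
print(schedule)
```

In this sketch the first round must be an annotated one, because pseudo labels from an untrained global model carry no information; a real system would replace the linear model with a segmentation network and the thresholding with whatever pseudo-labeling rule the framework actually uses.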