Federated learning has gained popularity among medical institutions because it enables collaborative training between clients (e.g., hospitals) without aggregating data. However, because annotations are costly to create, especially for large 3D image datasets, clinical institutions often lack enough labeled data to train locally. As a result, the collaborative model performs poorly under limited supervision. Large institutions, on the other hand, have the resources to compile data repositories with high-resolution images and labels. Individual clients can therefore use the knowledge learned from public data repositories to mitigate the shortage of private annotated images. In this paper, we propose a federated few-shot learning method with dual knowledge distillation. The method allows joint training with limited annotations across clients without jeopardizing privacy. Its supervised component extracts features from the limited labeled data at each client, while unlabeled data is used to distill both feature-based and response-based knowledge from a national data repository, further improving the accuracy of the collaborative model and reducing the communication cost. Extensive evaluations are conducted on 3D magnetic resonance knee images from a private clinical dataset. Our proposed method achieves better performance and requires less training time than other semi-supervised federated learning methods. Code and additional visualization results are available at https://github.com/hexiaoxiao-cs/fedml-knee.
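To make the idea of dual knowledge distillation concrete, the following is a minimal sketch of how a feature-based term and a response-based term could be combined into one objective on unlabeled data. It is not the authors' implementation: the abstract does not specify the loss forms, so the mean-squared-error feature term, the temperature-softened KL response term, the function names, and the weights `alpha`, `beta`, and `temperature` are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def response_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Response-based term (assumed form): KL(teacher || student) on
    temperature-softened predictions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(
        pi * math.log(pi / max(qi, 1e-12))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )
    return temperature ** 2 * kl


def feature_kd_loss(student_features, teacher_features):
    """Feature-based term (assumed form): mean squared error between
    intermediate feature vectors of student and teacher."""
    n = len(student_features)
    return sum(
        (s - t) ** 2 for s, t in zip(student_features, teacher_features)
    ) / n


def dual_kd_loss(student_logits, teacher_logits,
                 student_features, teacher_features,
                 alpha=0.5, beta=0.5, temperature=2.0):
    """Weighted sum of both distillation terms; alpha and beta are
    hypothetical weights, not values taken from the paper."""
    feature_term = feature_kd_loss(student_features, teacher_features)
    response_term = response_kd_loss(
        student_logits, teacher_logits, temperature
    )
    return alpha * feature_term + beta * response_term
```

In this sketch the teacher would be a model trained on the public repository and the student the client model. For a segmentation task the same terms would be averaged over voxels, and a supervised loss on the few labeled samples would be added separately.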