Federated learning (FL) is a well-established technique for utilizing decentralized data while preserving privacy. However, real-world applications often involve inherent challenges such as partially labeled datasets, where not all clients possess expert annotations for all labels of interest, leaving large portions of unlabeled data unused. In this study, we conduct the largest federated cardiac CT imaging analysis to date, focusing on partially labeled datasets ($n=8,124$) of Transcatheter Aortic Valve Implantation (TAVI) patients across eight hospital clients. Transformer architectures, the major building blocks of current foundation models, have shown superior performance to traditional CNNs when trained on larger cohorts. However, when trained on small task-specific labeled sample sizes, their underlying attention mechanism cannot currently be exploited for improved performance. We therefore developed a two-stage semi-supervised learning strategy that distills knowledge from several task-specific CNNs (landmark detection and calcification segmentation) into a single transformer model, utilizing the large amounts of unlabeled data that typically reside unused in hospitals to mitigate these issues. This method not only improves the predictive accuracy and generalizability of transformer-based architectures but also enables all partial labels to be learned simultaneously within a single transformer model across the federation. Additionally, we show that our transformer-based model extracts more meaningful features for further downstream tasks than the UNet-based one by training only the last layer to additionally solve coronary artery segmentation. We make the code and weights of the final model openly available; they can serve as a foundation model for further research in cardiac CT imaging.
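The two-stage strategy can be sketched in miniature. In this hedged illustration, the task-specific CNN teachers and the transformer student are replaced by simple linear stand-ins, and the hospital's unlabeled pool by random feature vectors; the names `teachers`, `pseudo`, and `fit_head` are illustrative assumptions, not the paper's implementation. Stage 1 has each teacher pseudo-label the unlabeled pool for its own task; stage 2 fits a single multi-headed student on all pseudo-labels jointly, so one model covers every partial label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical task-specific "teachers" (stand-ins for the trained CNNs):
# each maps an input feature vector to its own task's binary output.
teachers = {
    "landmarks": lambda x: (x @ np.array([1.0, 0.0]) > 0).astype(float),
    "calcium":   lambda x: (x @ np.array([0.0, 1.0]) > 0).astype(float),
}

# Simulated unlabeled pool residing at a hospital client.
unlabeled = rng.normal(size=(200, 2))

# Stage 1: every teacher annotates the whole pool, turning partially
# labeled data into fully (pseudo-)labeled data for all tasks at once.
pseudo = {task: teacher(unlabeled) for task, teacher in teachers.items()}

def fit_head(X, y, lr=0.1, steps=500):
    """Fit one logistic-regression head by gradient descent (student stand-in)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid prediction
        w -= lr * X.T @ (p - y) / len(y)     # logistic loss gradient
    return w

# Stage 2: a single multi-head student (one head per task, standing in for
# the transformer) is distilled from all teachers' pseudo-labels jointly.
student = {task: fit_head(unlabeled, y) for task, y in pseudo.items()}

# The distilled student now answers every task, closely matching its teachers.
for task, w in student.items():
    preds = (unlabeled @ w > 0).astype(float)
    print(task, "teacher agreement:", (preds == pseudo[task]).mean())
```

The same pattern scales to the federated setting: pseudo-labeling runs locally at each client, so only model updates, never patient data, leave the hospital.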