In this study, we automate quantitative mammographic breast density estimation with neural networks and show that this tool is a strong use case for federated learning on multi-institutional datasets. Our dataset comprised bilateral craniocaudal (CC) and mediolateral oblique (MLO) mammographic views from two separate institutions. Two U-Nets were trained separately on algorithm-generated labels to segment the breast and the dense tissue in these images, from which breast percent density (PD) was then calculated. The networks were trained with federated learning and compared against three non-federated baselines: two trained on the individual single-institution datasets and one trained on the aggregated multi-institutional dataset. We demonstrate that training on multi-institutional datasets is critical to algorithm generalizability. We further show that federated learning on multi-institutional datasets improves generalization to unseen data to nearly the same level as centralized training on the same datasets, indicating that federated learning can be applied to our method to improve generalizability while maintaining patient privacy.
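The two computational steps named in the abstract, deriving PD from a pair of segmentations and combining models trained at separate institutions, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: PD is taken to be the dense-tissue area as a percentage of the breast area, and the federated aggregation is taken to be sample-size-weighted averaging of model weights (the FedAvg scheme), which may differ from the aggregation rule actually used. The function names `percent_density` and `federated_average` are hypothetical, and model weights are represented as flat lists of floats purely for illustration.

```python
def percent_density(breast_mask, dense_mask):
    """Percent density from two binary masks given as rows of 0/1 values.

    Assumption: PD = 100 * (dense pixels inside the breast) / (breast pixels).
    """
    breast_area = 0
    dense_area = 0
    for breast_row, dense_row in zip(breast_mask, dense_mask):
        for in_breast, is_dense in zip(breast_row, dense_row):
            if in_breast:
                breast_area += 1
                # Dense pixels are counted only where the breast mask is set.
                if is_dense:
                    dense_area += 1
    if breast_area == 0:
        raise ValueError("breast mask is empty; percent density is undefined")
    return 100.0 * dense_area / breast_area


def federated_average(site_weights, site_sizes):
    """Sample-size-weighted average of per-site weights (FedAvg-style).

    Assumption: each site sends a flat list of floats of equal length;
    only these weights, never the images, leave the institution.
    """
    if len(site_weights) != len(site_sizes) or not site_weights:
        raise ValueError("need one weight list and one sample count per site")
    total = float(sum(site_sizes))
    n_params = len(site_weights[0])
    return [
        sum(weights[i] * (size / total)
            for weights, size in zip(site_weights, site_sizes))
        for i in range(n_params)
    ]


# Toy example: 8 breast pixels, 2 of them dense; the dense pixel at the
# top-left corner lies outside the breast and is ignored.
breast = [[0, 1, 1, 0],
          [1, 1, 1, 1],
          [0, 1, 1, 0]]
dense = [[1, 0, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
pd_value = percent_density(breast, dense)

# Toy example: two institutions with 100 and 300 training images.
global_weights = federated_average([[1.0, 2.0], [3.0, 6.0]], [100, 300])
```

In this sketch the larger institution contributes proportionally more to the shared model, which is the usual motivation for weighting by sample count; an unweighted mean would be an equally valid design choice if the sites should count equally.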