Privacy-preserving model co-training in medical research is often hindered by two problems: server-dependent architectures are incompatible with protected hospital data systems, and the predominant focus on relative effect measures (hazard ratios) offers little clinical interpretability for absolute survival risk assessment. We propose FedRD, a communication-efficient framework for federated risk difference estimation in distributed survival data. Unlike typical federated learning frameworks (e.g., FedAvg), which require persistent server connections and extensive iterative communication, FedRD is server-independent and needs minimal communication: one round of summary-statistics exchange for the stratified model and three rounds for the unstratified model. Crucially, FedRD provides valid confidence intervals and hypothesis tests, capabilities that FedAvg-based frameworks lack. We establish the asymptotic properties of FedRD and prove that the unstratified FedRD estimator is asymptotically equivalent to a pooled individual-level analysis. Simulation studies and real-world clinical applications across different countries demonstrate that FedRD outperforms local and federated baselines in both estimation accuracy and prediction performance, providing an architecturally feasible solution for absolute risk assessment in privacy-restricted, multi-site clinical studies.
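To make the idea of a single round of summary-statistics exchange concrete, the sketch below shows one generic way sites could share only a local risk difference and its variance and then combine them. It is not the FedRD estimator: it assumes a Kaplan-Meier risk estimate with Greenwood variance at a fixed time horizon and a fixed-effect inverse-variance combination, and the function names (`km_survival`, `local_summary`, `combine`) are hypothetical.

```python
import math


def km_survival(times, events, t):
    """Kaplan-Meier survival at time t and its Greenwood variance.

    times: follow-up times; events: 1 = event observed, 0 = censored.
    """
    surv = 1.0
    greenwood = 0.0
    # Distinct observed event times up to the horizon t.
    event_times = sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        d = sum(1 for ti, ei in zip(times, events) if ti == u and ei == 1)
        surv *= 1.0 - d / at_risk
        if at_risk > d:  # skip the term that is undefined when everyone at risk fails
            greenwood += d / (at_risk * (at_risk - d))
    return surv, surv ** 2 * greenwood


def local_summary(treated, control, t):
    """Computed inside one site; only (risk difference, variance) leaves the site.

    treated / control: (times, events) tuples for each arm.
    """
    s1, v1 = km_survival(treated[0], treated[1], t)
    s0, v0 = km_survival(control[0], control[1], t)
    # Risk = 1 - survival; the arms are independent, so the variances add.
    return (1.0 - s1) - (1.0 - s0), v1 + v0


def combine(summaries, z=1.959964):
    """Inverse-variance combination of site summaries (assumes all variances > 0).

    Any site holding all summaries can run this; no central server is needed.
    Returns (pooled risk difference, standard error, 95% confidence interval).
    """
    weights = [1.0 / var for _, var in summaries]
    total = sum(weights)
    rd = sum(w * est for w, (est, _) in zip(weights, summaries)) / total
    se = math.sqrt(1.0 / total)
    return rd, se, (rd - z * se, rd + z * se)
```

In this sketch each site would call `local_summary` on its own data, broadcast the resulting pair to the other sites, and every site could then call `combine` on the collected pairs to obtain the same pooled estimate and interval.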