The rapid adoption of data-driven methods in biomedicine has intensified concerns over privacy, governance, and regulation, limiting raw data sharing and hindering the assembly of representative cohorts for clinically relevant AI. This landscape calls for practical, efficient privacy solutions: cryptographic defenses often impose heavy computational overhead, and differential privacy can degrade model performance, leading to sub-optimal outcomes in real-world settings. Here, we present INFL, a lightweight federated learning method based on implicit neural representations that addresses these challenges. Our approach integrates plug-and-play, coordinate-conditioned modules into client models, embeds a secret key directly into the architecture, and supports seamless aggregation across heterogeneous sites. We evaluate INFL on diverse biomedical omics tasks with both public and private data, including cohort-scale classification in bulk proteomics, regression for perturbation prediction in single-cell transcriptomics, and clustering in spatial transcriptomics and multi-omics. Across these tasks, we demonstrate that INFL achieves strong, controllable privacy while preserving the utility required for downstream scientific and clinical applications.