Federated learning (FL) enables collaborative training across organizations without sharing raw data, but it is hindered by statistical heterogeneity (non-i.i.d. client data) and by the instability of naive weight averaging under client drift. In many cross-silo deployments, FL is warm-started from a strong pretrained backbone (e.g., one pretrained on ImageNet-1K) and then adapted to local domains. Motivated by recent evidence that ReLU-like gating regimes (structural knowledge) stabilize earlier than the remaining parameter values (quantitative knowledge), we propose FedSQ (Federated Structural-Quantitative learning), a transfer-initialized federated procedure for neural networks based on a DualCopy, piecewise-linear view of deep networks. FedSQ freezes a structural copy of the pretrained model, which induces fixed binary gating masks during federated fine-tuning, while only a quantitative copy is optimized locally and aggregated across rounds. Fixing the gating reduces learning to affine refinements within each gating regime, which stabilizes aggregation under heterogeneous partitions. Experiments on two convolutional neural network backbones under i.i.d. and Dirichlet splits show that FedSQ improves robustness and can reduce the number of rounds needed to reach the best validation performance relative to standard baselines, while preserving accuracy in the transfer setting.
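The dual-copy mechanism described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the one-hidden-layer regression network, the squared-error loss, the mean-shifted client data, the weighted-average aggregation rule, and every name (`gates`, `forward`, `local_update`, `fedavg`) are assumptions chosen for illustration. The sketch shows only the core idea: a frozen structural copy supplies binary ReLU masks, and a quantitative copy is trained locally under those masks and then averaged.

```python
import numpy as np

def gates(structural, x):
    """Binary ReLU gating mask computed from the frozen structural copy."""
    W, b = structural
    return (x @ W + b > 0).astype(x.dtype)

def forward(quant, mask, x):
    """Quantitative copy evaluated under fixed gates (affine in x per regime)."""
    W1, b1, w2 = quant
    return (mask * (x @ W1 + b1)) @ w2

def local_update(quant, structural, x, y, lr=0.01, steps=20):
    """Gradient steps on 0.5 * mean squared error; only the quantitative copy moves."""
    W1, b1, w2 = (p.copy() for p in quant)
    mask = gates(structural, x)  # constant across steps: the structural copy is frozen
    n = len(x)
    for _ in range(steps):
        h = mask * (x @ W1 + b1)
        err = h @ w2 - y
        g_w2 = h.T @ err / n
        g_pre = np.outer(err, w2) * mask / n  # gradient w.r.t. the gated pre-activation
        W1 -= lr * (x.T @ g_pre)
        b1 -= lr * g_pre.sum(axis=0)
        w2 -= lr * g_w2
    return W1, b1, w2

def fedavg(client_params, weights):
    """Weighted average of each parameter tensor across clients."""
    weights = np.asarray(weights, dtype=float) / np.sum(weights)
    n_tensors = len(client_params[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, client_params))
                 for i in range(n_tensors))

# Hypothetical "pretrained" weights; both copies start from them.
rng = np.random.default_rng(0)
d, hidden = 4, 8
W0 = rng.normal(size=(d, hidden))
b0 = np.zeros(hidden)
v0 = 0.1 * rng.normal(size=hidden)
structural = (W0.copy(), b0.copy())           # frozen, never updated
global_q = (W0.copy(), b0.copy(), v0.copy())  # optimized and aggregated

# Heterogeneous clients: inputs drawn around different means (an assumption).
true_w = rng.normal(size=d)
clients = []
for shift in (-1.0, 0.0, 1.0):
    x = rng.normal(loc=shift, size=(32, d))
    clients.append((x, x @ true_w))

for _ in range(5):  # communication rounds
    updates = [local_update(global_q, structural, x, y) for x, y in clients]
    global_q = fedavg(updates, [len(x) for x, _ in clients])
```

Because the masks depend only on the frozen structural copy, every client evaluates a given input under the same gating regime in every round, so the averaged parameters are combined within a shared piecewise-linear partition.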