Current network training paradigms primarily focus on either centralized or decentralized data regimes. However, in practice, data availability often exhibits a hybrid nature, where both regimes coexist. This hybrid setting presents new opportunities for model training, as the two regimes offer complementary trade-offs: decentralized data is abundant but subject to heterogeneity and communication constraints, while centralized data, though limited in volume and potentially unrepresentative, enables better curation and high-throughput access. Despite its potential, effectively combining these paradigms remains challenging, and few frameworks are tailored to hybrid data regimes. To address this, we propose a novel framework that constructs a model atlas from decentralized models and leverages centralized data to refine a global model within this structured space. The refined model is then used to reinitialize the decentralized models. Our method synergizes federated learning (to exploit decentralized data) and model merging (to utilize centralized data), enabling effective training under hybrid data availability. Theoretically, we show that our approach achieves faster convergence than methods relying solely on decentralized data, due to variance reduction in the merging process. Extensive experiments demonstrate that our framework consistently outperforms purely centralized, purely decentralized, and existing hybrid-adaptable methods. Notably, our method remains robust even when the centralized and decentralized data domains differ or when decentralized data contains noise, significantly broadening its applicability.
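To make the three-step loop above concrete, here is a minimal, illustrative sketch (not the paper's actual implementation) of one hybrid training round on a toy least-squares problem. The function names (`local_update`, `build_atlas`, `refine_on_centralized`) and the SVD-based atlas construction are assumptions for illustration only; the key idea shown is that decentralized client models span a low-dimensional "atlas" subspace, the small centralized dataset is used only to pick the best point inside that subspace, and the result reinitializes the clients for the next round.

```python
# Illustrative sketch, assuming flat parameter vectors and an SVD-based atlas.
import numpy as np

def local_update(w, X, y, lr=0.1, steps=20):
    """Plain gradient descent on one client's decentralized data."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def build_atlas(client_ws):
    """Model atlas: mean + principal directions spanned by the client models."""
    W = np.stack(client_ws)                          # (num_clients, dim)
    mean = W.mean(axis=0)
    _, _, Vt = np.linalg.svd(W - mean, full_matrices=False)
    return mean, Vt                                  # rows of Vt span the atlas

def refine_on_centralized(mean, Vt, Xc, yc, lr=0.05, steps=100):
    """Optimize only the atlas coefficients on the small centralized set."""
    alpha = np.zeros(Vt.shape[0])
    for _ in range(steps):
        w = mean + Vt.T @ alpha
        grad_w = Xc.T @ (Xc @ w - yc) / len(yc)
        alpha = alpha - lr * (Vt @ grad_w)           # chain rule: d loss / d alpha
    return mean + Vt.T @ alpha

# Toy data: a few heterogeneous clients plus a small centralized set.
rng = np.random.default_rng(0)
dim, w_true = 10, np.random.default_rng(0).normal(size=10)
clients = []
for _ in range(5):
    shift = rng.normal(scale=0.5, size=dim)          # per-client shift = heterogeneity
    X = rng.normal(size=(50, dim)) + shift
    clients.append((X, X @ w_true + 0.1 * rng.normal(size=50)))
Xc = rng.normal(size=(20, dim)); yc = Xc @ w_true    # limited centralized data

w_global = np.zeros(dim)
for rnd in range(5):                                     # hybrid training rounds
    client_ws = [local_update(w_global.copy(), X, y) for X, y in clients]
    mean, Vt = build_atlas(client_ws)                    # step 1: atlas from client models
    w_global = refine_on_centralized(mean, Vt, Xc, yc)   # step 2: centralized refinement
    # step 3: w_global reinitializes every client in the next round
    print(f"round {rnd}: error vs. w_true = {np.linalg.norm(w_global - w_true):.4f}")
```

In this sketch the centralized data never leaves the low-dimensional atlas: it only selects coefficients over directions already supplied by the decentralized models, which is one way to read the claimed variance-reduction effect of the merging step.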