Learning from the collective knowledge of data dispersed across private sources can provide neural networks with enhanced generalization capabilities. Federated learning, a method for collaboratively training a machine learning model across remote clients, achieves this by combining client models via the orchestration of a central server. However, current approaches face two critical limitations: (i) they struggle to converge when client domains are sufficiently different, and (ii) current aggregation techniques produce an identical global model for each client. In this work, we address these issues by reformulating the typical federated learning setup: rather than learning a single global model, we learn N models, each optimized for a common objective. To achieve this, we apply a weighted distance minimization to model parameters shared in a peer-to-peer topology. The resulting framework, Iterative Parameter Alignment, applies naturally to the cross-silo setting, and has the following properties: (i) a unique solution for each participant, with the option to globally converge each model in the federation, and (ii) an optional early-stopping mechanism to elicit fairness among peers in collaborative learning settings. These characteristics jointly provide a flexible new framework for iteratively learning from peer models trained on disparate datasets. We find that the technique achieves competitive results on a variety of data partitions compared to state-of-the-art approaches. Further, we show that the method is robust to divergent domains (i.e., disjoint classes across peers) where existing approaches struggle.
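The core mechanism described above, minimizing a weighted distance between model parameters shared among peers, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the peers are reduced to raw parameter vectors, the weights and the function name `align_step` are assumptions, and the per-peer task loss is omitted so only the alignment term is shown.

```python
import numpy as np

def align_step(params, weights, lr=0.1):
    """One alignment update: each peer i descends the gradient of
    sum_j w[i, j] * ||p_i - p_j||^2, pulling its parameters toward
    the other peers' parameters while keeping its own copy."""
    n = len(params)
    new_params = []
    for i in range(n):
        grad = np.zeros_like(params[i])
        for j in range(n):
            if i != j:
                # Gradient of w_ij * ||p_i - p_j||^2 w.r.t. p_i (up to the factor 2).
                grad += weights[i, j] * (params[i] - params[j])
        new_params.append(params[i] - lr * 2 * grad)
    return new_params

# Three peers with scalar "models" and uniform weights (illustrative values).
params = [np.array([0.0]), np.array([1.0]), np.array([2.0])]
weights = np.ones((3, 3)) / 3
for _ in range(50):
    params = align_step(params, weights)
# With uniform weights the peers contract toward their mean (1.0);
# non-uniform weights would instead yield a distinct solution per peer.
```

In the actual framework each peer would also minimize its own task loss on local data alongside this alignment term, which is what yields N distinct yet mutually informed models rather than a single consensus point.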