Learning from the collective knowledge of data dispersed across private sources can provide neural networks with enhanced generalization capabilities. Federated learning, a method for collaboratively training a machine learning model across remote clients, achieves this by combining client models under the orchestration of a central server. However, existing approaches face two critical limitations: (i) they struggle to converge when client domains are sufficiently different, and (ii) their aggregation techniques produce an identical global model for each client. In this work, we address these issues by reformulating the typical federated learning setup: rather than learning a single global model, we learn N models, each optimized for a common objective. To achieve this, we apply a weighted distance minimization to model parameters shared in a peer-to-peer topology. The resulting framework, Iterative Parameter Alignment, applies naturally to the cross-silo setting and has the following properties: (i) a unique solution for each participant, with the option to globally converge each model in the federation, and (ii) an optional early-stopping mechanism to elicit fairness among peers in collaborative learning settings. These characteristics jointly provide a flexible new framework for iteratively learning from peer models trained on disparate datasets. We find that the technique achieves competitive results on a variety of data partitions compared to state-of-the-art approaches. Further, we show that the method is robust to divergent domains (i.e., disjoint classes across peers), where existing approaches struggle.
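To make the idea of weighted distance minimization over peer-shared parameters concrete, the following is a minimal sketch and not the authors' implementation. It assumes a squared Euclidean distance, uniform peer weights, linear regression models, and synchronous rounds; none of these choices are specified in the abstract. All names (`local_grad`, `alignment_grad`, `align_peers`, `lam`) are hypothetical. Each peer keeps its own parameters and descends its local loss plus a penalty on the distance to every other peer's parameters.

```python
import numpy as np

def local_grad(theta, X, y):
    """Gradient of the mean squared error of a linear model y ≈ X @ theta."""
    return 2.0 * X.T @ (X @ theta - y) / len(y)

def alignment_grad(theta_i, peer_thetas, weights):
    """Gradient of sum_j w_j * ||theta_i - theta_j||^2 with respect to theta_i.

    Assumption: squared Euclidean distance; the abstract does not name the metric.
    """
    return sum(2.0 * w * (theta_i - t) for w, t in zip(weights, peer_thetas))

def align_peers(datasets, dim, rounds=500, lr=0.05, lam=1.0):
    """Train one model per peer; each is pulled toward the other peers' parameters."""
    n = len(datasets)
    thetas = [np.zeros(dim) for _ in range(n)]
    for _ in range(rounds):
        # Peers exchange parameters once per round (peer-to-peer, no central server).
        snapshot = [t.copy() for t in thetas]
        for i, (X, y) in enumerate(datasets):
            peers = [snapshot[j] for j in range(n) if j != i]
            weights = [1.0 / (n - 1)] * (n - 1)  # assumption: uniform peer weights
            grad = local_grad(snapshot[i], X, y)
            grad = grad + lam * alignment_grad(snapshot[i], peers, weights)
            thetas[i] = snapshot[i] - lr * grad
    return thetas

# Toy example with divergent domains: each peer observes only one input feature,
# so neither can recover the full parameter vector from its own data alone.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0])

X_a = np.zeros((50, 2))
X_a[:, 0] = rng.normal(size=50)  # peer A sees feature 0 only
X_b = np.zeros((50, 2))
X_b[:, 1] = rng.normal(size=50)  # peer B sees feature 1 only

datasets = [(X_a, X_a @ true_theta), (X_b, X_b @ true_theta)]
thetas = align_peers(datasets, dim=2)
```

In this noise-free toy setting, each peer's parameters should approach `true_theta`, even though each peer's own data constrains only one coordinate. The alignment term carries the missing coordinate across from the other peer. Each peer still holds its own parameter vector throughout, which mirrors the abstract's framing of learning N models rather than a single global one.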