Federated learning (FL) enables collaborative training of machine learning models without sharing training data between users, typically by aggregating model gradients on a central server. Decentralized federated learning is a rising paradigm in which users collaboratively train machine learning models in a peer-to-peer manner, without the need for a central aggregation server. However, before decentralized FL can be applied in real-world training environments, nodes that deviate from the FL process (Byzantine nodes) must be considered when selecting an aggregation function. Recent research has focused on Byzantine-robust aggregation for client-server or fully connected networks, but has not yet evaluated such aggregation schemes on the complex topologies possible with decentralized FL. Empirical evidence of Byzantine robustness across differing network topologies is therefore needed. This work investigates the effects of state-of-the-art Byzantine-robust aggregation methods in complex, large-scale network structures. We find that state-of-the-art Byzantine-robust aggregation strategies are not resilient in large, non-fully connected networks. Our findings thus point the field towards the development of topology-aware aggregation schemes, which are especially necessary for large-scale real-world deployment.
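To make the aggregation-function choice concrete, the following is a minimal sketch (not the paper's code; all names and values are illustrative assumptions) contrasting plain averaging with coordinate-wise median, a common Byzantine-robust rule: a single Byzantine node can skew the mean arbitrarily, while the median stays near the honest updates.

```python
# Sketch: coordinate-wise median as a Byzantine-robust aggregation rule,
# compared with plain averaging. Illustrative only, not the paper's method.
from statistics import mean, median

# Gradient vectors reported by 4 honest nodes and 1 Byzantine node (last).
gradients = [
    [0.9, 1.1, 1.0],
    [1.0, 0.9, 1.1],
    [1.1, 1.0, 0.9],
    [1.0, 1.0, 1.0],
    [100.0, -100.0, 100.0],  # Byzantine: arbitrarily corrupted update
]

def aggregate(grads, rule):
    """Apply the given rule (e.g. mean or median) to each coordinate."""
    return [rule(coord) for coord in zip(*grads)]

avg = aggregate(gradients, mean)    # skewed far from the honest gradients
med = aggregate(gradients, median)  # robust: close to the honest gradients
```

Note that such per-coordinate rules assume the aggregator sees all updates at once; in a non-fully connected decentralized topology each node only sees its neighbors' updates, which is exactly the setting this work examines.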