Federated learning (FL) enables collaborative training of machine learning models while protecting data privacy. Traditional FL relies heavily on a trusted centralized server. It is also vulnerable to poisoning attacks, its sharing of raw model updates puts private training data at risk of being reconstructed, and its heavy communication cost limits efficiency. Although decentralized FL eliminates the dependence on a central server, it may worsen the other problems because it places insufficient constraints on participant behavior and on distributed consensus over the global model update. In this paper, we propose a blockchain-based, fully decentralized peer-to-peer (P2P) framework for FL, called BlockDFL. It leverages blockchain to compel participants to behave properly. It integrates gradient compression with a voting mechanism of our own design to coordinate decentralized FL among peer participants without mutual trust, while preventing data from being reconstructed from transmitted model updates. Extensive experiments on two real-world datasets show that BlockDFL achieves accuracy competitive with centralized FL and defends against poisoning attacks while remaining efficient and scalable. In particular, even when the proportion of malicious participants is as high as 40%, BlockDFL preserves the accuracy of FL, outperforming existing blockchain-based fully decentralized FL frameworks.