Federated Learning (FL) has emerged as a distributed machine learning paradigm that does not transmit end-user data, which helps avoid privacy leakage. Participating devices in FL are usually bandwidth-constrained, and in wireless networks the uplink is much slower than the downlink, which causes a severe uplink communication bottleneck. A prominent direction for alleviating this problem is federated dropout, which drops a fraction of the weights of local models. However, existing federated dropout studies focus on random or ordered dropout and lack theoretical support, so their performance is not guaranteed. In this paper, we propose Federated learning with Bayesian Inference-based Adaptive Dropout (FedBIAD), which regards the weight rows of local models as probability distributions and adaptively drops a subset of weight rows based on importance indicators correlated with the trend of the local training loss. With FedBIAD, each client adaptively selects a high-quality dropping pattern with accurate approximations and transmits only the parameters of non-dropped weight rows, which reduces uplink costs while improving accuracy. Theoretical analysis demonstrates that the convergence rate of the average generalization error of FedBIAD is minimax optimal up to a squared logarithmic factor. Extensive experiments on image classification and next-word prediction show that, compared with existing approaches, FedBIAD provides a 2x uplink reduction with an accuracy increase of up to 2.41%, even on non-Independent and Identically Distributed (non-IID) data, and reduces training time by up to 72%.
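The row-wise dropping and partial upload described above can be illustrated with a minimal sketch. It is not the FedBIAD algorithm: the importance score is a placeholder supplied by the caller (the abstract's indicator is Bayesian and tied to the local loss trend, which is not reproduced here), and the function names `select_rows`, `pack_update`, and `unpack_update` are hypothetical. Zero-filling the dropped rows on the server side is likewise an assumption made only for illustration.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def select_rows(importance: Sequence[float], drop_rate: float) -> List[int]:
    """Return sorted indices of the rows to keep.

    The lowest-scoring rows are dropped. `importance` is a placeholder
    score per weight row (assumption), not the paper's indicator.
    """
    n_rows = len(importance)
    n_drop = int(drop_rate * n_rows)  # number of rows to drop, rounded down
    order = sorted(range(n_rows), key=lambda i: importance[i])  # ascending
    return sorted(order[n_drop:])


def pack_update(weights: Matrix, keep: Sequence[int]) -> Dict:
    """Build the uplink payload: only the kept rows plus their indices."""
    return {
        "n_rows": len(weights),
        "n_cols": len(weights[0]),
        "rows": list(keep),
        "values": [list(weights[i]) for i in keep],
    }


def unpack_update(payload: Dict) -> Matrix:
    """Rebuild a full-size matrix on the server.

    Dropped rows are filled with zeros here (assumption for illustration).
    """
    full = [[0.0] * payload["n_cols"] for _ in range(payload["n_rows"])]
    for i, row in zip(payload["rows"], payload["values"]):
        full[i] = list(row)
    return full
```

With a drop rate of 0.5, the payload carries half of the weight rows plus a short list of row indices, so the number of uploaded parameters is roughly halved; the index overhead is ignored in that estimate.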