Large-scale neural networks possess considerable expressive power and are well suited to complex learning tasks in industrial applications. However, they pose significant challenges for training under the current Federated Learning (FL) paradigm. Existing approaches to efficient FL training often rely on model parameter dropout. Yet manipulating individual model parameters is not only inefficient at meaningfully reducing communication overhead when training large-scale FL models, but may also be detrimental to scaling efforts and model performance, as recent research has shown. To address these issues, we propose the Federated Opportunistic Block Dropout (FedOBD) approach. Its key novelty is that it decomposes large-scale models into semantic blocks, so that FL participants can opportunistically upload only the quantized blocks deemed significant for training the model to the FL server for aggregation. Extensive experiments evaluating FedOBD against four state-of-the-art approaches on multiple real-world datasets show that it reduces overall communication overhead by more than 88% compared with the best-performing baseline, while achieving the highest test accuracy. To the best of our knowledge, FedOBD is the first approach to perform dropout on FL models at the block level rather than at the individual parameter level.
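To make the block-level idea concrete, the following is a minimal sketch of how a participant might choose and compress the blocks it uploads. It is not the FedOBD implementation. The importance score (update norm divided by block size), the `keep_fraction` parameter, the plain uniform quantizer, and all function names are assumptions introduced here for illustration, since the abstract does not specify them. Blocks are modeled as flat lists of floats keyed by a hypothetical block name.

```python
import math


def block_importance(old, new):
    """Assumed score: L2 norm of the block's update, divided by block size."""
    squared = sum((n - o) ** 2 for o, n in zip(old, new))
    return math.sqrt(squared) / len(new)


def quantize(values, levels=256):
    """Uniform min-max quantization (a stand-in, not the paper's quantizer).

    Returns integer codes plus the offset and step needed to decode them.
    """
    lo, hi = min(values), max(values)
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Map integer codes back to approximate float values."""
    return [lo + c * step for c in codes]


def select_blocks(old_model, new_model, keep_fraction=0.5):
    """Keep the highest-scoring blocks and quantize them for upload.

    Blocks that are not selected are simply dropped from the payload.
    """
    scores = {
        name: block_importance(old_model[name], new_model[name])
        for name in new_model
    }
    n_keep = max(1, math.ceil(keep_fraction * len(scores)))
    kept = sorted(scores, key=scores.get, reverse=True)[:n_keep]
    return {name: quantize(new_model[name]) for name in kept}


# Toy example: three hypothetical blocks before and after local training.
old_model = {"conv1": [0.0, 0.0, 0.0, 0.0], "conv2": [1.0, 1.0], "fc": [0.5, 0.5, 0.5]}
new_model = {"conv1": [0.01, 0.0, -0.01, 0.0], "conv2": [1.5, 0.5], "fc": [0.5, 0.6, 0.5]}

payload = select_blocks(old_model, new_model, keep_fraction=0.5)
for name, (codes, lo, step) in payload.items():
    print(name, codes, dequantize(codes, lo, step))
```

In this toy run, `conv1` changed the least relative to its size and is left out of the payload, while `conv2` and `fc` are uploaded as 8-bit codes. The communication saving in such a scheme comes from two places: whole blocks are skipped, and the blocks that are sent use fewer bits per parameter.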