Federated Learning (FL) is a distributed client-server paradigm in which multiple clients collaboratively train a global Machine Learning (ML) model without sharing sensitive local data. However, FL often yields lower accuracy than traditional ML algorithms due to statistical heterogeneity across clients. Prior works attempt to address this by using model updates, such as loss and bias, from client models to select participants that can improve the global model's accuracy. However, these updates do not accurately capture a client's heterogeneity, and the associated selection methods are non-deterministic. We mitigate these limitations by introducing Terraform, a novel client selection methodology that uses gradient updates and a deterministic selection algorithm to choose heterogeneous clients for retraining. This two-pronged approach allows Terraform to achieve up to 47 percent higher accuracy than prior works. We further demonstrate its efficiency through comprehensive ablation studies and training-time analyses, providing strong justification for Terraform's robustness.
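To make the idea of deterministic, heterogeneity-aware client selection concrete, the sketch below ranks clients by how far their gradient updates deviate from the mean update and picks the top-k, breaking ties by client ID so the result is reproducible. This is a minimal illustrative heuristic under assumed inputs (a dict of flattened per-client gradient vectors), not Terraform's actual algorithm.

```python
import numpy as np

def select_clients(gradient_updates, k):
    """Deterministically select the k clients whose gradient updates
    deviate most from the mean update (a simple proxy for statistical
    heterogeneity). Illustrative only; not the paper's method.

    gradient_updates: dict mapping client id -> flattened gradient vector
    k: number of clients to select for retraining
    """
    ids = sorted(gradient_updates)                      # fixed client order
    grads = np.stack([gradient_updates[i] for i in ids])
    mean_grad = grads.mean(axis=0)                      # average update
    deviation = np.linalg.norm(grads - mean_grad, axis=1)
    # Sort by (largest deviation, smallest id): ties break deterministically.
    order = sorted(range(len(ids)), key=lambda j: (-deviation[j], ids[j]))
    return [ids[j] for j in order[:k]]
```

Because both the client ordering and the tie-breaking rule are fixed, repeated calls on the same inputs always return the same selection, unlike sampling-based selection schemes.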