Given the sensitivity of data, Federated Learning (FL) enables distributed machine learning while safeguarding data privacy and accommodating the requirements of heterogeneous devices. In semi-decentralized FL, however, clients' communication and training states are dynamic: they vary with local training fluctuations, heterogeneous data distributions, and intermittent client participation. Most existing studies assume stable client states and thus neglect the dynamic challenges inherent in real-world deployments. To tackle this issue, we propose a TRust-Aware clIent scheduLing mechanism called TRAIL, which assesses client states and contributions and improves training efficiency through selective client participation. We consider a semi-decentralized FL framework in which edge servers and clients train a shared global model via unreliable intra-cluster model aggregation and inter-cluster model consensus. First, we develop an adaptive hidden semi-Markov model to estimate clients' communication states and contributions. Next, we formulate a client-server association optimization problem that minimizes the global training loss. Guided by a convergence analysis, we design a greedy client scheduling algorithm. Finally, experiments on real-world datasets demonstrate that TRAIL outperforms state-of-the-art baselines, improving test accuracy by 8.7% and reducing training loss by 15.3%.
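To make the scheduling idea concrete, the following is a minimal sketch of one greedy trust-aware selection step, assuming per-client trust and contribution scores have already been produced upstream (e.g., by an HSMM-based state estimator). All names here (`Client`, `schedule_clients`, `capacity`) are hypothetical illustrations, not the paper's actual interface or algorithm details.

```python
# Hypothetical sketch: greedily associate the highest trust-weighted clients
# with edge servers, subject to a per-server capacity limit. This is an
# illustration of the general greedy-scheduling idea, not the paper's method.
from dataclasses import dataclass

@dataclass
class Client:
    cid: int
    trust: float         # estimated communication-state reliability in [0, 1]
    contribution: float  # estimated marginal contribution to the global model

def schedule_clients(clients, num_servers, capacity):
    """Greedy client-server association by trust-weighted contribution."""
    # Rank clients by their trust-weighted contribution, highest first.
    ranked = sorted(clients, key=lambda c: c.trust * c.contribution, reverse=True)
    assignment = {s: [] for s in range(num_servers)}
    for client in ranked:
        # Place each client on the currently least-loaded server with room.
        server = min(assignment, key=lambda s: len(assignment[s]))
        if len(assignment[server]) < capacity:
            assignment[server].append(client.cid)
    return assignment

if __name__ == "__main__":
    clients = [Client(i, trust=0.5 + 0.05 * i, contribution=1.0 - 0.03 * i)
               for i in range(10)]
    print(schedule_clients(clients, num_servers=2, capacity=3))
```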