Federated Learning (FL) enables distributed Artificial Intelligence (AI) across cloud-edge environments by allowing collaborative model training without centralizing data. In cross-device deployments, FL systems face strict communication and participation constraints, as well as strongly non-independent and identically distributed (non-IID) data that degrades convergence and model quality. Since only a subset of devices (a.k.a. clients) can participate in each training round, intelligent client selection becomes a key systems challenge. This paper proposes FedLECC (Federated Learning with Enhanced Cluster Choice), a lightweight, cluster-aware, and loss-guided client selection strategy for cross-device FL. FedLECC groups clients by label-distribution similarity and prioritizes clusters and clients with higher local loss, enabling the selection of a small yet informative and diverse set of clients. Experimental results under severe label skew show that FedLECC improves test accuracy by up to 12%, while reducing communication rounds by approximately 22% and overall communication overhead by up to 50% compared to strong baselines. These results demonstrate that informed client selection improves the efficiency and scalability of FL workloads in cloud-edge systems.
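To make the selection strategy concrete, the following is a minimal sketch of the cluster-aware, loss-guided idea the abstract describes: clients are grouped by label-distribution similarity (here via a simple k-means over per-client label histograms), clusters are ranked by mean local loss, and the highest-loss clients are drawn from clusters in that order until the participation budget is met. The function name, the use of k-means, and the round-robin draw are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def select_clients(label_hists, local_losses, num_clusters=3, budget=4, seed=0):
    """Sketch of cluster-aware, loss-guided client selection (hypothetical helper).

    label_hists: (n_clients, n_classes) normalized label histograms per client.
    local_losses: (n_clients,) most recent local training loss per client.
    Returns a list of selected client indices of length <= budget.
    """
    rng = np.random.default_rng(seed)
    n = len(label_hists)

    # 1) Group clients by label-distribution similarity with a small k-means.
    centers = label_hists[rng.choice(n, num_clusters, replace=False)].astype(float)
    for _ in range(10):
        dists = np.linalg.norm(label_hists[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for k in range(num_clusters):
            if (assign == k).any():
                centers[k] = label_hists[assign == k].mean(axis=0)

    # 2) Rank clusters by mean local loss (higher loss = more informative first).
    cluster_loss = np.array([
        local_losses[assign == k].mean() if (assign == k).any() else -np.inf
        for k in range(num_clusters)
    ])
    order = np.argsort(-cluster_loss)

    # 3) Draw the highest-loss client from each cluster, round-robin,
    #    until the per-round participation budget is exhausted.
    pools = {k: sorted(np.where(assign == k)[0], key=lambda i: -local_losses[i])
             for k in range(num_clusters)}
    selected = []
    while len(selected) < budget:
        progressed = False
        for k in order:
            if pools[k]:
                selected.append(int(pools[k].pop(0)))
                progressed = True
                if len(selected) == budget:
                    break
        if not progressed:
            break
    return selected
```

With a participation budget of at least the number of non-empty clusters, the round-robin draw guarantees both diversity (at least one client per cluster) and informativeness (the globally highest-loss client is always among the selected), which is the trade-off the abstract attributes to FedLECC.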