While semi-asynchronous federated learning (SAFL) combines the efficiency of synchronous training with the flexibility of asynchronous updates, it inherently suffers from participation bias, which non-IID data distributions further exacerbate. More importantly, a hierarchical architecture shifts participation from individual clients to client groups, intensifying the issue. Despite notable advances in SAFL research, most existing works still focus on conventional cloud-end architectures and largely overlook the critical impact of non-IID data on scheduling across the cloud-edge-client hierarchy. To tackle these challenges, we propose FedCure, an innovative semi-asynchronous federated learning framework that leverages coalition construction and participation-aware scheduling to mitigate participation bias under non-IID data. Specifically, FedCure operates through three key rules: (1) a preference rule that optimizes coalition formation by maximizing collective benefits and establishing theoretically stable partitions to reduce non-IID-induced performance degradation; (2) a scheduling rule that integrates the virtual queue technique with Bayesian-estimated coalition dynamics, mitigating efficiency loss while ensuring mean rate stability; and (3) a resource allocation rule that enhances computational efficiency by optimizing client CPU frequencies based on estimated coalition dynamics while satisfying delay requirements. Comprehensive experiments on four real-world datasets demonstrate that FedCure improves accuracy by up to 5.1× over four state-of-the-art baselines, while significantly enhancing efficiency, achieving the lowest coefficient of variation (0.0223) in per-round latency and maintaining long-term balance across diverse scenarios.
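To make the scheduling rule and the latency metric more concrete, the following is a minimal Python sketch of a generic virtual-queue scheduler in the Lyapunov drift-plus-penalty style, together with the coefficient of variation used to summarize per-round latency. It is an illustration under stated assumptions, not FedCure's actual algorithm: the function names, the fixed latency estimates (standing in for the Bayesian-estimated coalition dynamics), the target participation rates, and the trade-off weight `tradeoff_v` are all hypothetical.

```python
import statistics


def update_virtual_queue(queue, target_rate, participated):
    """Standard virtual-queue update: Q(t+1) = max(Q(t) + target - served, 0)."""
    served = 1.0 if participated else 0.0
    return max(queue + target_rate - served, 0.0)


def select_coalition(queues, est_latency, tradeoff_v):
    """Pick the coalition minimizing V * latency - Q (drift-plus-penalty).

    A large backlog Q means the coalition has fallen behind its target
    participation rate, so it gets priority despite a higher latency.
    """
    return min(queues, key=lambda c: tradeoff_v * est_latency[c] - queues[c])


def simulate(est_latency, target_rate, tradeoff_v, rounds):
    """Schedule one coalition per round and track participation counts."""
    queues = {c: 0.0 for c in est_latency}
    counts = {c: 0 for c in est_latency}
    for _ in range(rounds):
        chosen = select_coalition(queues, est_latency, tradeoff_v)
        counts[chosen] += 1
        for c in queues:
            queues[c] = update_virtual_queue(queues[c], target_rate[c], c == chosen)
    return counts, queues


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(samples) / statistics.mean(samples)


if __name__ == "__main__":
    # Hypothetical coalitions: one fast, one slow, each with a 30% target rate.
    latency = {"fast": 1.0, "slow": 3.0}
    targets = {"fast": 0.3, "slow": 0.3}
    counts, queues = simulate(latency, targets, tradeoff_v=1.0, rounds=1000)
    print(counts, queues)
```

In this sketch the slow coalition is still scheduled at roughly its target rate because its virtual queue grows whenever it is skipped. Keeping those queues bounded is what mean rate stability requires, and the weight `tradeoff_v` controls how much latency is traded for balanced participation.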


