Federated Learning (FL) has emerged as a promising edge computing paradigm for managing and utilizing massive distributed data at the network edge across data silos. However, FL still faces two challenges: system heterogeneity (i.e., the diversity of hardware resources across edge devices) and statistical heterogeneity (i.e., non-IID data). Although sparsification can extract diverse submodels for diverse clients, most sparse FL works either assign submodels using rigid, manually specified rules or prune a subset of parameters using heuristic strategies, resulting in inflexible sparsification and poor performance. In this work, we propose Learnable Personalized Sparsification for heterogeneous Federated learning (FedLPS), which learns to customize heterogeneous sparse models with importance-associated patterns and adaptive ratios, tackling system and statistical heterogeneity simultaneously. Specifically, FedLPS learns the importance of model units for local data representation and derives an importance-based sparse pattern with minimal heuristics to accurately extract personalized data features in non-IID settings. Furthermore, Prompt Upper Confidence Bound Variance (P-UCBV) is designed to adaptively determine sparse ratios by learning the superimposed effect of diverse device capabilities and non-IID data, aiming at resource self-adaptation with promising accuracy. Extensive experiments show that FedLPS outperforms existing approaches in both accuracy and training cost, improving accuracy by 1.28%-59.34% while reducing running time by more than 68.80%.
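To make the two ingredients above concrete, the following is a minimal sketch only, not the FedLPS implementation. The abstract does not specify how importance scores are learned or how P-UCBV is defined, so this sketch assumes the importance scores are already given and substitutes a plain UCB1 bandit for P-UCBV when choosing among a discrete set of candidate sparse ratios. The names `importance_mask` and `UCB1RatioSelector`, the exploration constant `c`, and the scalar reward are all hypothetical.

```python
import math


def importance_mask(importance, sparse_ratio):
    """Return a 0/1 mask keeping the most important units.

    `importance` is a list of per-unit scores (assumed given here; FedLPS
    learns them). `sparse_ratio` is the fraction of units to drop.
    At least one unit is always kept.
    """
    n_keep = max(1, round(len(importance) * (1.0 - sparse_ratio)))
    # Rank unit indices from most to least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(importance))]


class UCB1RatioSelector:
    """Plain UCB1 over candidate sparse ratios (a stand-in, NOT P-UCBV)."""

    def __init__(self, ratios, c=2.0):
        self.ratios = list(ratios)
        self.c = c  # hypothetical exploration constant
        self.counts = [0] * len(self.ratios)   # pulls per ratio
        self.means = [0.0] * len(self.ratios)  # running mean reward per ratio

    def select(self):
        """Return the index of the ratio to try next."""
        # Try every candidate once before applying the confidence bound.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + math.sqrt(self.c * math.log(total) / n)
            for mean, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        """Fold an observed reward into the running mean for `arm`."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

In a client loop, one would call `select()` to pick a ratio, build the mask with `importance_mask(scores, selector.ratios[arm])`, train the masked submodel, and then call `update(arm, reward)`. The reward could be any scalar that trades accuracy against training cost; how FedLPS actually forms that signal is not stated in the abstract.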