SplitFed Learning (SFL) combines federated learning and split learning to enable collaborative training across distributed edge devices; however, it faces significant challenges in heterogeneous environments with diverse computational and communication capabilities. This paper proposes \textit{SuperSFL}, a federated split learning framework that leverages a weight-sharing super-network to dynamically generate resource-aware client-specific subnetworks, effectively mitigating device heterogeneity. SuperSFL introduces Three-Phase Gradient Fusion (TPGF), an optimization mechanism that coordinates local updates, server-side computation, and gradient fusion to accelerate convergence. In addition, a fault-tolerant client-side classifier and collaborative client--server aggregation enable uninterrupted training under intermittent communication failures. Experimental results on CIFAR-10 and CIFAR-100 with up to 100 heterogeneous clients show that SuperSFL converges $2$--$5\times$ faster in terms of communication rounds than baseline SFL while achieving higher accuracy, resulting in up to $20\times$ lower total communication cost and $13\times$ shorter training time. SuperSFL also demonstrates improved energy efficiency compared to baseline methods, making it a practical solution for federated learning in heterogeneous edge environments.