In resource-constrained IoT-edge environments, Split Federated (SplitFed) learning is used to improve training efficiency. Each IoT device splits its full DNN model at a designated cut layer into a device-side model and a server-side model, and offloads the latter to the edge server. However, existing research overlooks four critical issues: (1) the impact on training efficiency of heterogeneity in IoT devices' resource capacities and local data sample sizes; (2) the influence of the edge server's computation and network resource allocation on training efficiency; (3) the data leakage risk associated with the offloaded server-side sub-model; and (4) the privacy drawbacks of current centralized algorithms. Consequently, proactively identifying the optimal cut layer and server resource requirements for each IoT device, so as to minimize training latency while satisfying a data leakage risk rate constraint, remains challenging. To address these problems, this paper first formulates the latency and data leakage risk of training DNN models with SplitFed learning. We then cast the SplitFed learning problem as a mixed-integer nonlinear programming problem. To solve it, we propose a decentralized Proactive Model Offloading and Resource Allocation (DP-MORA) scheme, which enables each IoT device to determine its cut layer and resource requirements from its own multidimensional training configuration, without knowledge of other devices' configurations. Extensive experiments on two real-world datasets show that, compared with several baseline algorithms across various experimental settings, DP-MORA reduces DNN model training latency and improves training efficiency while complying with the data leakage risk constraint.
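As a rough illustration of the per-device decision described in the abstract, the following sketch enumerates candidate cut layers for a single device and keeps the lowest-latency one whose leakage risk stays within a bound. It is a toy model under stated assumptions, not the DP-MORA scheme itself: the `Layer` fields, the latency terms in `round_latency`, and the use of the offloaded parameter fraction as the risk measure in `leakage_risk` are all hypothetical stand-ins for the paper's formulations, which the abstract does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Layer:
    flops: float      # training compute per sample for this layer (assumed unit: FLOPs)
    out_bytes: float  # activation size per sample at this layer's output (bytes)
    params: int       # number of trainable parameters


def round_latency(layers: List[Layer], cut: int, n_samples: int,
                  dev_flops: float, srv_flops: float, bandwidth: float) -> float:
    """Toy per-round latency with layers[:cut] on the device and layers[cut:] on the server."""
    dev = sum(l.flops for l in layers[:cut]) * n_samples / dev_flops
    srv = sum(l.flops for l in layers[cut:]) * n_samples / srv_flops
    if cut == len(layers):
        comm = 0.0  # nothing offloaded, so no activations or gradients are exchanged
    else:
        # Assumption: activations go up and same-sized gradients come back.
        comm = 2 * layers[cut - 1].out_bytes * n_samples / bandwidth
    return dev + srv + comm


def leakage_risk(layers: List[Layer], cut: int) -> float:
    """Hypothetical risk proxy: fraction of parameters offloaded to the server."""
    total = sum(l.params for l in layers)
    return sum(l.params for l in layers[cut:]) / total


def best_cut(layers: List[Layer], n_samples: int, dev_flops: float,
             srv_flops: float, bandwidth: float,
             max_risk: float) -> Optional[Tuple[int, float]]:
    """Return (cut, latency) minimizing latency subject to leakage_risk <= max_risk."""
    best: Optional[Tuple[int, float]] = None
    for cut in range(1, len(layers) + 1):  # cut = number of layers kept on the device
        if leakage_risk(layers, cut) > max_risk:
            continue
        latency = round_latency(layers, cut, n_samples, dev_flops, srv_flops, bandwidth)
        if best is None or latency < best[1]:
            best = (cut, latency)
    return best


# Illustrative numbers only; they are not taken from the paper.
demo_layers = [
    Layer(flops=4e6, out_bytes=1000, params=100),
    Layer(flops=4e6, out_bytes=500, params=300),
    Layer(flops=2e6, out_bytes=10, params=600),
]
print(best_cut(demo_layers, n_samples=100, dev_flops=1e8,
               srv_flops=1e9, bandwidth=1e6, max_risk=0.7))
```

In this sketch, tightening `max_risk` pushes the chosen cut deeper into the model, which keeps more layers on the slower device and raises latency. That is the trade-off between training latency and leakage risk that the abstract describes. The sketch uses exhaustive search over one device with fixed server resources, so it leaves out the joint resource allocation and decentralized coordination that the full problem involves.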