Split Learning (SL) is a promising variant of Federated Learning (FL) in which the AI model is split into portions that are trained collaboratively by the clients and the server. By offloading the computation-intensive portion to the server, SL enables efficient model training on resource-constrained clients. Despite its growing range of applications, SL still lacks a rigorous convergence analysis on non-IID data, which is critical for hyperparameter selection. In this paper, we first prove that SL exhibits an $\mathcal{O}(1/\sqrt{R})$ convergence rate for non-convex objectives on non-IID data, where $R$ is the total number of training rounds. The derived convergence results help explain the effect of crucial factors in SL, such as data heterogeneity and the synchronization interval. Furthermore, by comparing with the convergence result of FL, we show that the guarantee of SL is worse than that of FL in terms of training rounds on non-IID data. The experimental results verify our theory. Additional findings on the comparison between FL and SL in cross-device settings are also reported.
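To make the split concrete, the following is a minimal sketch of one way the client/server division of a model could look in code. It is an illustration under stated assumptions, not the training procedure analyzed in the paper: the toy two-parameter model, the names `ClientPart`, `ServerPart`, and `train`, the synthetic data, and the use of a single client are all hypothetical choices made for brevity.

```python
import math
import random


class ClientPart:
    """Client-side portion of the model: one tanh unit (hypothetical toy model)."""

    def __init__(self, w):
        self.w = w

    def forward(self, x):
        # Cache the input and activation for the backward pass.
        self.x = x
        self.h = math.tanh(self.w * x)
        return self.h  # activation sent to the server

    def backward(self, grad_h, lr):
        # Chain rule through tanh, using the gradient returned by the server.
        grad_w = grad_h * (1.0 - self.h ** 2) * self.x
        self.w -= lr * grad_w


class ServerPart:
    """Server-side portion: one linear unit plus a squared-error loss."""

    def __init__(self, w):
        self.w = w

    def step(self, h, y, lr):
        err = self.w * h - y
        loss = 0.5 * err ** 2
        grad_w = err * h
        grad_h = err * self.w  # computed with the pre-update server weight
        self.w -= lr * grad_w
        return loss, grad_h  # grad_h is sent back to the client


def train(rounds=200, lr=0.1, seed=0):
    """Run split training on synthetic data and return the mean loss per round."""
    rng = random.Random(seed)
    client, server = ClientPart(0.5), ServerPart(0.5)
    # Synthetic targets generated by a model of the same form (assumption).
    xs = [rng.uniform(-1.0, 1.0) for _ in range(32)]
    data = [(x, 0.8 * math.tanh(1.5 * x)) for x in xs]
    losses = []
    for _ in range(rounds):
        total = 0.0
        for x, y in data:
            h = client.forward(x)                 # client: forward pass
            loss, grad_h = server.step(h, y, lr)  # server: loss and update
            client.backward(grad_h, lr)           # client: update from grad_h
            total += loss
        losses.append(total / len(data))
    return losses
```

In this sketch the client only ever shares the intermediate activation `h` and receives `grad_h` in return, so the raw input `x` stays on the client. For simplicity the label `y` is passed to the server here. A multi-client setting with non-IID data, which is the setting the convergence analysis addresses, is deliberately omitted.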