This paper introduces a novel framework that achieves a high compression ratio in Split Learning (SL), where resource-constrained devices participate in large-scale model training. We show that compressing feature maps in SL yields biased gradients that slow convergence and diminish the generalization of the resulting models. Our theoretical analysis reveals how compression errors critically hinder SL performance, an effect that previous methods underestimate. To address these challenges, we employ a narrow bit-width encoded mask to compensate for the sparsification error without increasing the asymptotic time complexity. Supported by rigorous theoretical analysis, our framework significantly reduces compression errors and accelerates convergence. Extensive experiments further verify that our method outperforms existing solutions in both training efficiency and communication complexity.
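To make the compensation idea concrete, the following is a minimal NumPy sketch, assuming top-k magnitude sparsification of the feature map with the pruned entries compensated by a 1-bit sign mask and a single shared scale; the abstract does not fix the exact encoding, so the function names and this particular scheme are hypothetical illustrations rather than the paper's method.

```python
import numpy as np

def compress_feature_map(h, k_ratio=0.1):
    """Top-k sparsification plus a 1-bit sign mask that compensates the
    pruned entries (hypothetical variant of the narrow bit-width mask)."""
    flat = h.ravel().astype(np.float32)
    k = max(1, int(k_ratio * flat.size))
    top_idx = np.argpartition(np.abs(flat), -k)[-k:]  # k largest magnitudes
    kept = np.zeros_like(flat)
    kept[top_idx] = flat[top_idx]
    residual = flat - kept                            # sparsification error
    # Narrow bit-width compensation: 1 bit (the sign) per pruned element
    # plus one shared float scale (mean magnitude of the residual).
    sign_mask = np.sign(residual).astype(np.int8)
    n_pruned = np.count_nonzero(residual)
    scale = np.abs(residual).sum() / n_pruned if n_pruned else 0.0
    return top_idx, flat[top_idx], sign_mask, np.float32(scale), h.shape

def decompress_feature_map(top_idx, top_val, sign_mask, scale, shape):
    """Server-side reconstruction: exact top-k values, plus a coarse
    mask-based estimate of everything that was pruned."""
    flat = sign_mask.astype(np.float32) * scale
    flat[top_idx] = top_val
    return flat.reshape(shape)

# Usage: the compensated reconstruction has lower L2 error than plain
# top-k whenever nonzero entries were pruned.
h = np.random.randn(4, 256).astype(np.float32)
h_hat = decompress_feature_map(*compress_feature_map(h, k_ratio=0.1))
print(np.linalg.norm(h - h_hat) / np.linalg.norm(h))
```

Under these assumptions, the pruned entries cost roughly one bit each instead of 32, which is what keeps the compression ratio high while shrinking the bias that plain top-k sparsification would otherwise inject into the gradients.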