This paper proposes a novel communication-efficient split learning (SL) framework, named SplitFC, which reduces the communication overhead required for transmitting intermediate feature and gradient vectors during the SL training process. The key idea of SplitFC is to exploit the different degrees of dispersion exhibited across the columns of the intermediate feature and gradient matrices. SplitFC incorporates two compression strategies: (i) adaptive feature-wise dropout and (ii) adaptive feature-wise quantization. In the first strategy, the intermediate feature vectors are dropped with adaptive dropout probabilities determined by the standard deviation of these vectors. By the chain rule, the intermediate gradient vectors associated with the dropped feature vectors are then dropped as well. In the second strategy, the non-dropped intermediate feature and gradient vectors are quantized using adaptive quantization levels determined by the ranges of the vectors. To minimize the quantization error, the optimal quantization levels for this strategy are derived in closed form. Simulation results on the MNIST, CIFAR-10, and CelebA datasets demonstrate that SplitFC provides more than a 5.6% increase in classification accuracy compared to state-of-the-art SL frameworks, while requiring 320 times less communication overhead than the vanilla SL framework without compression.
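The two strategies can be illustrated with a minimal Python sketch. It is not the SplitFC algorithm itself: the abstract does not state the dropout-probability formula or the closed-form quantization levels, so the sketch substitutes two assumed heuristics, namely a keep probability proportional to each column's standard deviation and a number of uniform quantization levels proportional to each column's range. All function names and the parameters `avg_drop` and `budget` are hypothetical.

```python
import random
import statistics


def drop_probabilities(features, avg_drop=0.5):
    """Per-column drop probabilities for a batch-by-dimension feature matrix.

    Assumed rule (not the paper's formula): the keep probability of a column
    is proportional to its standard deviation, clipped to [0, 1].
    """
    columns = list(zip(*features))
    stds = [statistics.pstdev(col) for col in columns]
    total = sum(stds)
    if total == 0:
        # No dispersion anywhere: fall back to a uniform drop probability.
        return [avg_drop] * len(columns)
    scale = (1.0 - avg_drop) * len(columns) / total
    return [1.0 - min(1.0, scale * s) for s in stds]


def sample_kept(probs, rng):
    """Draw the indices of the columns that survive the dropout."""
    return [j for j, p in enumerate(probs) if rng.random() >= p]


def select_columns(matrix, kept):
    """Keep only the surviving columns. The same index list is reused for the
    gradient matrix, since gradients of dropped features are dropped too."""
    return [[row[j] for j in kept] for row in matrix]


def levels_for_ranges(ranges, budget):
    """Assumed heuristic: split a total level budget in proportion to each
    column's range, with at least two levels per column."""
    total = sum(ranges)
    if total == 0:
        return [2] * len(ranges)
    return [max(2, round(budget * r / total)) for r in ranges]


def quantize_column(col, levels):
    """Uniform quantization of one column over its own [min, max] range.
    Expects levels >= 2."""
    lo, hi = min(col), max(col)
    if hi == lo:
        return list(col)
    step = (hi - lo) / (levels - 1)
    return [lo + round((x - lo) / step) * step for x in col]


if __name__ == "__main__":
    rng = random.Random(0)
    feats = [[rng.gauss(0.0, s) for s in (0.01, 1.0, 3.0)] for _ in range(8)]
    grads = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]

    kept = sample_kept(drop_probabilities(feats), rng)
    feats_kept = select_columns(feats, kept)
    grads_kept = select_columns(grads, kept)  # same columns, by the chain rule

    cols = list(zip(*feats_kept))
    levels = levels_for_ranges([max(c) - min(c) for c in cols], budget=16)
    quantized = [quantize_column(c, q) for c, q in zip(cols, levels)]
    print(kept, levels)
```

In this sketch, low-dispersion columns are the most likely to be dropped and wide-range columns receive more quantization levels, which mirrors the abstract's qualitative description. The rescaling of kept features that is common in dropout is omitted, and the quantizer is applied only to the features for brevity.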