Federated Split Learning (FSL) is a promising distributed learning paradigm that combines the strengths of Federated Learning (FL) and Split Learning (SL) to preserve model privacy while reducing the resource overhead of each client, which is especially valuable for large transformer models in resource-constrained environments such as the Internet of Things (IoT). However, almost all existing work evaluates FSL only with simple neural network models. The few efforts that adopt Vision Transformers (ViTs) as the model architecture train them from scratch, incurring enormous training overhead on each resource-limited device. In this paper, we therefore harness Pre-trained Image Transformers (PITs) as the initial model, in a framework coined FedV, to accelerate training and improve model robustness. Furthermore, we propose FedVZ to defend against gradient inversion attacks; notably, it remains compatible with black-box scenarios where gradient information is unavailable. Concretely, FedVZ approximates the server-side gradient with zeroth-order (ZO) optimization, which replaces backward propagation with just one forward pass. Empirically, we are the first to provide a systematic evaluation of FSL methods with PITs on real-world datasets, under different levels of partial device participation, and across heterogeneous data splits. Our experiments verify the effectiveness of our algorithms.
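To illustrate the idea behind the ZO approximation, the following is a minimal sketch (not the paper's implementation) of a one-point zeroth-order gradient estimator: the loss is evaluated once at a randomly perturbed point, and that single forward evaluation yields an unbiased estimate of the smoothed loss's gradient, with no backward pass. The function names and the toy quadratic loss are illustrative assumptions.

```python
import numpy as np

def zo_gradient_one_point(loss_fn, theta, mu=0.5, rng=None):
    """One-point ZO gradient estimate: g = loss_fn(theta + mu*u) * u / mu,
    where u is a random Gaussian direction. Only a single forward
    evaluation of loss_fn is needed; no backpropagation is performed."""
    rng = rng if rng is not None else np.random.default_rng(0)
    u = rng.standard_normal(theta.shape)
    return loss_fn(theta + mu * u) * u / mu

# Toy loss f(theta) = 0.5 * ||theta||^2, whose true gradient is theta.
loss = lambda th: 0.5 * np.dot(th, th)
theta = np.array([1.0, -2.0])

# A single one-point estimate is high-variance, so in practice many
# perturbations (or momentum across steps) are averaged; here we
# average 20000 estimates with a fixed seed for reproducibility.
rng = np.random.default_rng(0)
est = np.mean(
    [zo_gradient_one_point(loss, theta, mu=0.5, rng=rng) for _ in range(20000)],
    axis=0,
)
```

The averaged estimate approaches the true gradient `[1.0, -2.0]`; the variance-vs-queries trade-off is the usual cost of replacing backpropagation with forward-only ZO queries.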