R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models

Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML), where components of large ML models are outsourced to remote servers. A significant challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming that could jeopardize the learning process. This is especially pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding. In this paper, rigorous insights are provided into the influence of jamming LLM word embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). Based on this analysis, a physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks. R-SFLLM leverages wireless sensing data to gather information on the jamming directions-of-arrival (DoAs) in order to devise a novel, sensing-assisted anti-jamming strategy while jointly optimizing beamforming, user scheduling, and resource allocation. Extensive experiments using BERT and RoBERTa models demonstrate R-SFLLM's effectiveness, achieving close-to-baseline performance across various natural language processing (NLP) tasks and datasets. The proposed methodology further introduces an adversarial training component, where controlled noise exposure significantly enhances the LLM's resilience to perturbed parameters during training. The results show that more noise-sensitive models, such as RoBERTa, benefit from this feature, especially when resource allocation is unfair. It is also shown that worst-case jamming translates into worst-case model outcomes, thereby necessitating jamming-resilient SFL protocols.
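
The abstract ties the effect of jamming to the mean squared error (MSE) between the transmitted and received word embeddings. As a rough illustration only, the sketch below perturbs a toy embedding table with additive Gaussian noise and measures the resulting MSE. The Gaussian noise model, the table sizes, and the helper names `make_embeddings`, `jam`, and `mse` are assumptions made for this example; they are not the paper's channel model or code.

```python
import random


def make_embeddings(vocab_size, dim, seed=0):
    """Build a toy embedding table with standard normal entries (hypothetical data)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def jam(embeddings, noise_std, seed=1):
    """Return a perturbed copy of the table.

    Assumption: jamming is modeled as additive Gaussian noise scaled by
    noise_std. A fixed seed reuses the same noise pattern at every power level.
    """
    rng = random.Random(seed)
    return [[w + noise_std * rng.gauss(0.0, 1.0) for w in row] for row in embeddings]


def mse(clean, received):
    """Mean squared error over all entries of two equally shaped tables."""
    total = 0.0
    count = 0
    for row_clean, row_received in zip(clean, received):
        for c, r in zip(row_clean, row_received):
            total += (c - r) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    clean = make_embeddings(vocab_size=8, dim=16)
    for noise_std in (0.0, 0.1, 1.0):
        print(noise_std, mse(clean, jam(clean, noise_std)))
```

Because the same noise pattern is rescaled at each level, the MSE in this sketch grows with the square of `noise_std`, which is the quantity the abstract says upper-bounds the training loss divergence.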
