Fine-tuning adapts large language models (LLMs) to specialized applications, but its high computational cost often puts it out of reach for resource-constrained organizations. Cloud platforms could supply the needed resources, but sharing sensitive data with third parties raises privacy concerns. A promising solution is split learning for LLM fine-tuning, which partitions the model between clients and a server so that training proceeds collaboratively through exchanged intermediate data, such as activations and gradients, rather than raw data. This lets resource-constrained participants adapt LLMs without exposing their data directly. A growing body of literature has emerged to advance this paradigm, introducing a variety of model-level methods, system optimizations, and privacy attack and defense techniques for split learning. To bring clarity and direction to the field, a comprehensive survey is needed to classify, compare, and critique these diverse approaches. This paper fills the gap by presenting the first extensive survey dedicated to split learning for LLM fine-tuning. We propose a unified, fine-grained training pipeline that pinpoints the key operational components, and we systematically review state-of-the-art work along three core dimensions: model-level optimization, system-level efficiency, and privacy preservation. Through this structured taxonomy, we establish a foundation for advancing scalable, robust, and secure collaborative LLM adaptation.