Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies

Large models, as predicted by scaling laws, have made groundbreaking progress in many fields, particularly in natural language generation, where they have approached or even surpassed human-level performance. However, their unprecedented parameter counts bring significant computational and storage costs: these models require substantial computational resources and GPU memory to run. When large models are adapted to specific downstream tasks, their scale makes fine-tuning a significant challenge on hardware with limited computational power and GPU memory. To address this issue, Parameter-Efficient Fine-Tuning (PEFT) offers a practical solution: it adapts large pre-trained models to specific tasks or domains while minimizing the number of additional parameters introduced and the computational resources required. This review introduces the preliminary knowledge of PEFT, the core ideas and principles of various PEFT algorithms, the applications of PEFT, and potential future research directions. We believe that readers of this review can quickly grasp the PEFT methodology, thereby accelerating its development and innovation.

