Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning - 专知论文

会员服务 ·

0

参数高效 · 微调 · 指南 · 方法比较 · 语言模型 ·

2023 年 3 月 28 日

Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning

翻译：缩小规模以扩大规模：参数高效微调指南

Vladislav Lialin,Vijeta Deshpande,Anna Rumshisky

This paper presents a systematic overview and comparison of parameter-efficient fine-tuning methods covering over 40 papers published between February 2019 and February 2023. These methods aim to resolve the infeasibility and impracticality of fine-tuning large language models by only training a small set of parameters. We provide a taxonomy that covers a broad range of methods and present a detailed method comparison with a specific focus on real-life efficiency and fine-tuning multibillion-scale language models.

翻译：本文系统概述并比较了参数高效微调方法，涵盖2019年2月至2023年2月间发表的40余篇论文。这些方法旨在通过仅训练少量参数来解决大型语言模型微调不可行且不切实际的问题。我们提出了一种涵盖多种方法的分类体系，并进行了详细的方法比较，特别关注实际效率及对数十亿规模语言模型的微调。

0

相关内容

参数高效

百篇论文纵览大型语言模型最新研究进展

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【清华大学】Delta调优:预训练语言模型参数有效方法的综合研究，Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

【清华大学】Delta调优:预训练语言模型参数有效方法的综合研究，Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

专知会员服务

26+阅读 · 2022年3月15日

Swin Transformer重磅升级！Swin V2：向更大容量、更高分辨率的更大模型迈进

Swin Transformer重磅升级！Swin V2：向更大容量、更高分辨率的更大模型迈进

专知会员服务

28+阅读 · 2021年11月20日

预训练语言模型fine-tuning近期进展概述

预训练语言模型fine-tuning近期进展概述

专知会员服务

40+阅读 · 2021年4月9日

【ICML2020-伯克利】反直觉！大模型重压缩提升Transformer的训练和推理效率，47页ppt

【ICML2020-伯克利】反直觉！大模型重压缩提升Transformer的训练和推理效率，47页ppt

专知会员服务

70+阅读 · 2020年7月1日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

专知会员服务

28+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【KDD2019教程】从浅层到深层的语言表达:预训练、微调，等等，From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond

【KDD2019教程】从浅层到深层的语言表达:预训练、微调，等等，From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond

专知会员服务

16+阅读 · 2019年11月4日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

NeurlPS 2022 | 全新大模型参数高效微调方法：仅需训练0.3M的参数

NeurlPS 2022 | 全新大模型参数高效微调方法：仅需训练0.3M的参数

PaperWeekly

0+阅读 · 2022年11月9日

NeurlPS 2022 | 全新大模型参数高效微调方法SSF：仅需训练0.3M的参数，效果卓越

NeurlPS 2022 | 全新大模型参数高效微调方法SSF：仅需训练0.3M的参数，效果卓越

机器之心

0+阅读 · 2022年11月7日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Ladder Side-Tuning：预训练模型的“过墙梯”

Ladder Side-Tuning：预训练模型的“过墙梯”

PaperWeekly

0+阅读 · 2022年6月24日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

NLP 2018 Highlights：2018自然语言处理技术亮点汇总

NLP 2018 Highlights：2018自然语言处理技术亮点汇总

AINLP

10+阅读 · 2019年2月9日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

细胞外基质蛋白reelin对多发性骨髓瘤增殖和分化的影响及其机制

国家自然科学基金

0+阅读 · 2016年12月31日

超细Si颗粒镶嵌纳米多孔碳纤维的制备及其在锂离子电池负极中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

嵌入式密码芯片实现抗旁路攻击安全性评估技术研究

国家自然科学基金

0+阅读 · 2014年12月31日

有限阿贝尔群上的零和问题研究

国家自然科学基金

1+阅读 · 2013年12月31日

CuInGaSe2太阳能电池界面结构、界面态及其钝化

国家自然科学基金

0+阅读 · 2012年12月31日

稀疏网格谱方法及其在电子结构薛定谔方程上的应用

国家自然科学基金

0+阅读 · 2012年12月31日

超大规模集成电路布局的ell-1模优化模型及其算法研究

国家自然科学基金

0+阅读 · 2011年12月31日

有机铁电材料的纳米压印加工及应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

金属氧化物有序纳米结构阵列在染料电池中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

基于湍流非平衡输运特性改进湍流模型对角区分离模拟能力的研究

国家自然科学基金

0+阅读 · 2009年12月31日

G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks

Arxiv

0+阅读 · 2023年5月17日

Pruning Pre-trained Language Models Without Fine-Tuning

Arxiv

0+阅读 · 2023年5月16日

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Arxiv

22+阅读 · 2023年5月3日

Pre-training Methods in Information Retrieval

Arxiv

16+阅读 · 2021年11月27日

Recent Advances of Continual Learning in Computer Vision: An Overview

Recent Advances of Continual Learning in Computer Vision: An Overview

Arxiv

23+阅读 · 2021年9月23日

Survey: Transformer based Video-Language Pre-training

Arxiv

20+阅读 · 2021年9月21日

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions

Arxiv

20+阅读 · 2021年8月30日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Arxiv

24+阅读 · 2021年8月12日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Arxiv

16+阅读 · 2020年3月12日

VIP会员

文章信息

相关主题

最新内容

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

专知会员服务

4+阅读 · 7月17日

《边缘端实时无线感知赋能现场多机器人部署》200页

《边缘端实时无线感知赋能现场多机器人部署》200页

专知会员服务

5+阅读 · 7月17日

战力倍增器：自主武器系统与乌克兰及加沙冲突

战力倍增器：自主武器系统与乌克兰及加沙冲突

专知会员服务

4+阅读 · 7月17日

人工智能赋能战场情报：提速决策进程

人工智能赋能战场情报：提速决策进程

专知会员服务

2+阅读 · 7月17日

《拥抱新兴技术：面向未来军官的教育革新》

《拥抱新兴技术：面向未来军官的教育革新》

专知会员服务

5+阅读 · 7月17日

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

专知会员服务

2+阅读 · 7月17日

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

专知会员服务

3+阅读 · 7月17日

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

专知会员服务

11+阅读 · 7月16日

《无人地面战车（UGV）的崛起》报告

《无人地面战车（UGV）的崛起》报告

专知会员服务

7+阅读 · 7月16日

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

专知会员服务

6+阅读 · 7月16日

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

专知会员服务

13+阅读 · 7月16日

美陆军任务式指挥人工智能解决方案

美陆军任务式指挥人工智能解决方案

专知会员服务

13+阅读 · 7月16日

ICML 2026 | 理论级自动形式化：从孤立命题到统一形式化知识库

ICML 2026 | 理论级自动形式化：从孤立命题到统一形式化知识库

专知会员服务

9+阅读 · 7月16日

综述 | 现代智能体自我改进，从模型更新到脚手架演化

综述 | 现代智能体自我改进，从模型更新到脚手架演化

专知会员服务

16+阅读 · 7月16日

美国陆军宣布“项目融合-顶点6”：现代化进程的关键里程碑

美国陆军宣布“项目融合-顶点6”：现代化进程的关键里程碑

专知会员服务

13+阅读 · 7月15日

相关VIP内容

百篇论文纵览大型语言模型最新研究进展

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【清华大学】Delta调优:预训练语言模型参数有效方法的综合研究，Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

【清华大学】Delta调优:预训练语言模型参数有效方法的综合研究，Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

专知会员服务

26+阅读 · 2022年3月15日

Swin Transformer重磅升级！Swin V2：向更大容量、更高分辨率的更大模型迈进

Swin Transformer重磅升级！Swin V2：向更大容量、更高分辨率的更大模型迈进

专知会员服务

28+阅读 · 2021年11月20日

预训练语言模型fine-tuning近期进展概述

预训练语言模型fine-tuning近期进展概述

专知会员服务

40+阅读 · 2021年4月9日

【ICML2020-伯克利】反直觉！大模型重压缩提升Transformer的训练和推理效率，47页ppt

【ICML2020-伯克利】反直觉！大模型重压缩提升Transformer的训练和推理效率，47页ppt

专知会员服务

70+阅读 · 2020年7月1日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

【Google ICLR2020论文】嵌入式大规模检索的预训练任务，Pre-training Tasks for Embedding-based Large-scale Retrieval

专知会员服务

28+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【KDD2019教程】从浅层到深层的语言表达:预训练、微调，等等，From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond

【KDD2019教程】从浅层到深层的语言表达:预训练、微调，等等，From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond

专知会员服务

16+阅读 · 2019年11月4日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《边缘端实时无线感知赋能现场多机器人部署》200页

人工智能赋能战场情报：提速决策进程

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

战力倍增器：自主武器系统与乌克兰及加沙冲突

相关资讯

NeurlPS 2022 | 全新大模型参数高效微调方法：仅需训练0.3M的参数

NeurlPS 2022 | 全新大模型参数高效微调方法：仅需训练0.3M的参数

PaperWeekly

0+阅读 · 2022年11月9日

NeurlPS 2022 | 全新大模型参数高效微调方法SSF：仅需训练0.3M的参数，效果卓越

NeurlPS 2022 | 全新大模型参数高效微调方法SSF：仅需训练0.3M的参数，效果卓越

机器之心

0+阅读 · 2022年11月7日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Ladder Side-Tuning：预训练模型的“过墙梯”

Ladder Side-Tuning：预训练模型的“过墙梯”

PaperWeekly

0+阅读 · 2022年6月24日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

NLP 2018 Highlights：2018自然语言处理技术亮点汇总

NLP 2018 Highlights：2018自然语言处理技术亮点汇总

AINLP

10+阅读 · 2019年2月9日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks

Arxiv

0+阅读 · 2023年5月17日

Pruning Pre-trained Language Models Without Fine-Tuning

Arxiv

0+阅读 · 2023年5月16日

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Arxiv

22+阅读 · 2023年5月3日

Pre-training Methods in Information Retrieval

Arxiv

16+阅读 · 2021年11月27日

Recent Advances of Continual Learning in Computer Vision: An Overview

Recent Advances of Continual Learning in Computer Vision: An Overview

Arxiv

23+阅读 · 2021年9月23日

Survey: Transformer based Video-Language Pre-training

Arxiv

20+阅读 · 2021年9月21日

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions

Arxiv

20+阅读 · 2021年8月30日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Arxiv

24+阅读 · 2021年8月12日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Arxiv

16+阅读 · 2020年3月12日

相关基金

细胞外基质蛋白reelin对多发性骨髓瘤增殖和分化的影响及其机制

国家自然科学基金

0+阅读 · 2016年12月31日

超细Si颗粒镶嵌纳米多孔碳纤维的制备及其在锂离子电池负极中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

嵌入式密码芯片实现抗旁路攻击安全性评估技术研究

国家自然科学基金

0+阅读 · 2014年12月31日

有限阿贝尔群上的零和问题研究

国家自然科学基金

1+阅读 · 2013年12月31日

CuInGaSe2太阳能电池界面结构、界面态及其钝化

国家自然科学基金

0+阅读 · 2012年12月31日

稀疏网格谱方法及其在电子结构薛定谔方程上的应用

国家自然科学基金

0+阅读 · 2012年12月31日

超大规模集成电路布局的ell-1模优化模型及其算法研究

国家自然科学基金

0+阅读 · 2011年12月31日

有机铁电材料的纳米压印加工及应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

金属氧化物有序纳米结构阵列在染料电池中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

基于湍流非平衡输运特性改进湍流模型对角区分离模拟能力的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员