PTP: Boosting Stability and Performance of Prompt Tuning with Perturbation-Based Regularizer


May 3, 2023


Lichang Chen, Heng Huang, Minhao Cheng

From arXiv, 13 pages.

Recent studies show that prompt tuning can better leverage the power of large language models than fine-tuning on downstream natural language understanding tasks. However, existing prompt tuning methods suffer from training instability: the variance of scores across different random seeds is quite large. To address this critical problem, we first investigate and find that the loss landscape of vanilla prompt tuning is precipitous when visualized, so that a slight change in the input data can cause a large fluctuation in the loss. This is an essential factor behind the instability of prompt tuning. Based on this observation, we introduce perturbation-based regularizers, which can smooth the loss landscape, into prompt tuning. We propose a new algorithm, called Prompt Tuning with Perturbation-based regularizer (PTP), which not only alleviates training instability dramatically but also boosts the performance of prompt tuning. We design two kinds of perturbation-based regularizers: random-noise-based and adversarial-based. In particular, our proposed perturbations can be applied flexibly in both text space and embedding space. Extensive experiments show the effectiveness of our proposed methods in stabilizing training. Our new algorithms improve on state-of-the-art prompt tuning methods by 1.94% and 2.34% on the SuperGLUE and FewGLUE benchmarks, respectively.
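
The abstract names two kinds of perturbation (random noise and adversarial) but does not state the training objective. The sketch below is therefore only a generic illustration in embedding space, not the authors' implementation. It assumes a toy linear classifier in place of a language model, a regularized loss of the form clean loss plus a weighted loss on the perturbed embedding, Gaussian noise for the random variant, and a single normalized gradient-ascent step for the adversarial variant. All function names and the parameters `sigma`, `epsilon`, and `weight` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_loss(w, x, y):
    """Cross-entropy of a toy linear classifier on embedding x, label y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    s = z if y == 1 else -z  # loss = log(1 + exp(-s))
    if s > 0:
        return math.log1p(math.exp(-s))
    return -s + math.log1p(math.exp(s))


def input_gradient(w, x, y):
    """Gradient of logistic_loss with respect to the embedding x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return [(sigmoid(z) - y) * wi for wi in w]


def random_noise(dim, sigma, rng):
    """Random-noise perturbation: i.i.d. Gaussian noise (assumed distribution)."""
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


def adversarial_noise(w, x, y, epsilon):
    """Adversarial perturbation: one step along the loss gradient, scaled to
    L2 norm epsilon (an assumed one-step scheme)."""
    g = input_gradient(w, x, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return [0.0] * len(x)
    return [epsilon * gi / norm for gi in g]


def regularized_loss(w, x, y, delta, weight=1.0):
    """Clean loss plus a weighted loss on the perturbed embedding x + delta."""
    perturbed = [xi + di for xi, di in zip(x, delta)]
    return logistic_loss(w, x, y) + weight * logistic_loss(w, perturbed, y)


if __name__ == "__main__":
    rng = random.Random(0)
    w, x, y = [1.0, -2.0], [0.5, 0.25], 1
    print("clean loss:      ", logistic_loss(w, x, y))
    print("with random noise:", regularized_loss(w, x, y, random_noise(2, 0.05, rng)))
    print("with adversarial: ", regularized_loss(w, x, y, adversarial_noise(w, x, y, 0.1)))
```

In this toy setting the adversarial perturbation always raises the loss on the perturbed input, because it moves the embedding along the loss gradient. Penalizing that perturbed loss pushes training toward parameters whose loss changes less under small input changes, which is the smoothing effect the abstract describes.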

