Prompt tuning (PT), which tunes only the embeddings of an additional sequence of tokens per task while keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. Despite this, PT has been shown to rely heavily on good initialization of the prompt embeddings. In this work, we study meta prompt tuning (MPT) to systematically explore whether, and how, meta-learning can improve cross-task generalization in PT by learning to initialize the prompt embeddings from other relevant tasks. We empirically analyze a representative set of meta-learning algorithms in a wide range of adaptation settings with different source/target task configurations on a large set of few-shot tasks. With extensive experiments and analysis, we demonstrate the effectiveness of MPT. The improvement is particularly significant on classification tasks. For other kinds of tasks, such as question answering, we observe that while MPT outperforms PT in most cases, it does not always outperform multi-task learning. We further provide an in-depth analysis from the perspective of task similarity.
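To make the idea of learning a prompt initialization from source tasks concrete, the following is a minimal toy sketch, not the setup used in this work. It assumes a tiny frozen linear model in place of a PLM, synthetic regression tasks, and Reptile as one well-known first-order meta-learning algorithm; all names (`FROZEN_W`, `prompt_tune`, `reptile_init`, `make_task`) and all hyperparameters are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = List[float]
Task = List[Tuple[Vector, float]]  # (input features, target) pairs

# Stand-in for the frozen PLM: these weights are never updated (assumption).
FROZEN_W: Vector = [0.5, -1.0, 2.0]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict(prompt: Vector, x: Vector) -> float:
    # Toy frozen model: the prompt only shifts the output through frozen weights.
    return dot(FROZEN_W, x) + dot(FROZEN_W, prompt)


def loss(prompt: Vector, task: Task) -> float:
    # Mean squared error over the task's examples.
    return sum((predict(prompt, x) - y) ** 2 for x, y in task) / len(task)


def grad(prompt: Vector, task: Task) -> Vector:
    # Gradient of the MSE with respect to the prompt only.
    g = [0.0] * len(prompt)
    for x, y in task:
        err = predict(prompt, x) - y
        for i, w in enumerate(FROZEN_W):
            g[i] += 2.0 * err * w / len(task)
    return g


def prompt_tune(init: Vector, task: Task, steps: int, lr: float) -> Vector:
    # Plain prompt tuning: gradient descent on the prompt, model frozen.
    prompt = list(init)
    for _ in range(steps):
        g = grad(prompt, task)
        prompt = [p - lr * gi for p, gi in zip(prompt, g)]
    return prompt


def reptile_init(source_tasks: List[Task], dim: int, epochs: int = 20,
                 inner_steps: int = 5, inner_lr: float = 0.05,
                 meta_lr: float = 0.5) -> Vector:
    # Reptile: adapt to each source task, then move the shared
    # initialization part of the way toward the adapted prompt.
    init = [0.0] * dim
    for _ in range(epochs):
        for task in source_tasks:
            adapted = prompt_tune(init, task, inner_steps, inner_lr)
            init = [i + meta_lr * (a - i) for i, a in zip(init, adapted)]
    return init


def make_task(offset: float) -> Task:
    # Synthetic task: the target is the frozen model's output plus an offset.
    inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
    return [(x, dot(FROZEN_W, x) + offset) for x in inputs]


source_tasks = [make_task(o) for o in (2.5, 3.0, 3.5)]
target_task = make_task(3.2)  # related to, but distinct from, the source tasks

meta_init = reptile_init(source_tasks, dim=3)
zero_init = [0.0] * 3

# Few-step adaptation on the target task from each initialization.
loss_meta = loss(prompt_tune(meta_init, target_task, steps=2, lr=0.05), target_task)
loss_zero = loss(prompt_tune(zero_init, target_task, steps=2, lr=0.05), target_task)
print("loss from meta-learned init:", loss_meta)
print("loss from zero init:", loss_zero)
```

In this toy setting the source and target tasks are deliberately similar, so the meta-learned initialization starts close to a good target prompt and reaches a lower loss than the zero initialization after the same two update steps. With dissimilar source tasks that advantage would shrink, which is the kind of effect a task-similarity analysis examines.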