Prompt tuning (PT), which tunes only the embeddings of an additional sequence of tokens per task while keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. However, PT has been shown to rely heavily on a good initialization of the prompt embeddings. In this work, we study meta prompt tuning (MPT) to systematically explore whether and how meta-learning can improve cross-task generalization in PT by learning to initialize the prompt embeddings from other relevant tasks. We empirically analyze a representative set of meta-learning algorithms across a wide range of adaptation settings, with different source/target task configurations, on a large set of few-shot tasks. Through extensive experiments and analysis, we demonstrate the effectiveness of MPT. The improvement is particularly significant on classification tasks. For other kinds of tasks, such as question answering, we observe that while MPT outperforms PT in most cases, it does not always outperform multi-task learning. We further provide an in-depth analysis from the perspective of task similarity.
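To make the idea of a learned prompt initialization concrete, the following is a minimal toy sketch, not the setup used in this work. It assumes a tiny frozen linear "model" in place of a PLM, a single soft prompt token, synthetic regression tasks, and Reptile as one well-known first-order meta-learning algorithm (the abstract does not name the algorithms studied). All names (`FROZEN_W`, `predict`, `adapt`, `reptile_init`, `make_task`) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = List[float]
Task = List[Tuple[Vector, float]]

# Frozen weights standing in for the PLM (assumption: a toy linear head).
# Nothing below ever updates them; only the prompt vector is trained.
FROZEN_W: Vector = [0.5, -1.0, 2.0]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict(prompt: Vector, x: Vector) -> float:
    """Mean-pool the two-token sequence [prompt, x], then apply the frozen head."""
    pooled = [(p + xi) / 2.0 for p, xi in zip(prompt, x)]
    return dot(FROZEN_W, pooled)


def loss_and_grad(prompt: Vector, data: Task) -> Tuple[float, Vector]:
    """Mean squared error and its gradient with respect to the prompt only."""
    n = len(data)
    loss = 0.0
    grad = [0.0] * len(prompt)
    for x, y in data:
        err = predict(prompt, x) - y
        loss += err * err / n
        for i, w in enumerate(FROZEN_W):
            # d(prediction)/d(prompt[i]) = w / 2 because of the mean pooling.
            grad[i] += 2.0 * err * (w / 2.0) / n
    return loss, grad


def adapt(prompt: Vector, data: Task, steps: int, lr: float) -> Vector:
    """Plain prompt tuning: a few gradient steps on the prompt for one task."""
    p = list(prompt)
    for _ in range(steps):
        _, g = loss_and_grad(p, data)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p


def reptile_init(source_tasks: List[Task], dim: int, meta_steps: int = 50,
                 inner_steps: int = 3, inner_lr: float = 0.1,
                 meta_lr: float = 0.5) -> Vector:
    """Reptile: move the shared initialization toward each task-adapted prompt."""
    init = [0.0] * dim
    for step in range(meta_steps):
        # Round-robin over source tasks (instead of random sampling) for determinism.
        task = source_tasks[step % len(source_tasks)]
        adapted = adapt(init, task, inner_steps, inner_lr)
        init = [i + meta_lr * (a - i) for i, a in zip(init, adapted)]
    return init


def make_task(offset: float) -> Task:
    """Synthetic task: the target is the prompt-free prediction plus a task offset."""
    xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    zero_prompt = [0.0, 0.0, 0.0]
    return [(x, predict(zero_prompt, x) + offset) for x in xs]


source_tasks = [make_task(o) for o in (2.0, 3.0, 4.0)]
target_task = make_task(3.5)  # related to, but not among, the source tasks

meta_init = reptile_init(source_tasks, dim=3)
zero_init = [0.0, 0.0, 0.0]

# Few-shot adaptation on the target task: two gradient steps from each start.
loss_meta, _ = loss_and_grad(adapt(meta_init, target_task, 2, 0.1), target_task)
loss_zero, _ = loss_and_grad(adapt(zero_init, target_task, 2, 0.1), target_task)
print(f"meta-learned init: {loss_meta:.4f}, zero init: {loss_zero:.4f}")
```

In this toy setting the source and target tasks are deliberately similar, so the meta-learned initialization starts closer to a good target prompt than the zero initialization and reaches a lower loss after the same number of adaptation steps. With dissimilar source tasks the advantage could shrink or vanish, which is the kind of effect a task-similarity analysis would examine.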