Prompt tuning is a promising method for fine-tuning a pre-trained language model without retraining its large-scale parameters. Instead, it attaches a soft prompt to the input text, so that downstream tasks can be adapted by merely learning the embeddings of the prompt tokens. Nevertheless, existing methods still suffer from two challenges: (i) they struggle to balance accuracy and efficiency, since a longer (shorter) soft prompt generally yields better (worse) accuracy but at the cost of more (less) training time; and (ii) their performance may not be consistent when adapting to different downstream tasks. We attribute this to a single embedding space being responsible for the differing requirements of those tasks. To address these issues, we propose an Efficient Prompt Tuning method (EPT) based on multi-space projection and prompt fusion. Specifically, it decomposes a given soft prompt into a shorter prompt and two low-rank matrices, significantly reducing the training time. Accuracy is also enhanced by leveraging the low-rank matrices as an additional knowledge source that enriches the semantics of the short prompt. In addition, we project the soft prompt into multiple subspaces to improve performance consistency, and then adaptively learn the combination weights of the different spaces through a gating network. Experiments on 13 natural language processing downstream tasks show that our method significantly and consistently outperforms 11 comparison methods, with relative improvements of up to 12.9% and training time reduced by 14%.
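The two ingredients named above (a low-rank decomposition of the soft prompt, and a gated fusion over multiple subspace projections) can be illustrated with a minimal NumPy sketch. All dimensions, the projection shapes, and the fusion rule below are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions (hypothetical, chosen only for illustration)
L, d = 100, 768   # full soft-prompt length and embedding size
l, r = 20, 8      # short-prompt length and low-rank dimension
k = 4             # number of subspaces

# Trainable parameters of the sketch
short_prompt = rng.normal(size=(l, d)) * 0.02
A = rng.normal(size=(L, r)) * 0.02   # low-rank factor 1
B = rng.normal(size=(r, d)) * 0.02   # low-rank factor 2

# Low-rank reconstruction of a length-L prompt from far fewer parameters
low_rank_prompt = A @ B              # shape (L, d)

# Multi-space projection: map the short prompt into k subspaces
proj = rng.normal(size=(k, d, d)) * 0.02
subspace_prompts = np.einsum('ld,kde->kle', short_prompt, proj)  # (k, l, d)

# Gating network (one linear layer here): mean-pooled prompt -> softmax
# weights over the k subspaces
W_gate = rng.normal(size=(d, k)) * 0.02
logits = short_prompt.mean(axis=0) @ W_gate
gates = np.exp(logits) / np.exp(logits).sum()                    # sums to 1

# Weighted combination of the subspace prompts
fused_short = np.einsum('k,kle->le', gates, subspace_prompts)    # (l, d)

# One simple fusion choice: add the fused short prompt into the first
# l positions of the low-rank prompt
full_prompt = low_rank_prompt.copy()
full_prompt[:l] += fused_short

# The decomposition itself trains far fewer prompt parameters than
# tuning a full L x d prompt directly
full_params = L * d                  # 76,800
decomposed_params = l * d + L * r + r * d  # 22,304
```

The fused `full_prompt` would then be prepended to the input token embeddings, with only the small factors and the gating weights updated during training; the exact fusion and gating architecture in EPT may differ from this sketch.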