Prompt-based learning methods in semi-supervised learning (SSL) settings have been shown to be effective on multiple natural language understanding (NLU) datasets and tasks. However, manually designing multiple prompts and verbalizers requires domain knowledge and human effort, which makes these methods difficult and expensive to scale across datasets. In this paper, we propose two methods to automatically design multiple prompts and integrate an automatic verbalizer in SSL settings without sacrificing performance. The first method uses various demonstration examples with learnable continuous prompt tokens to create diverse prompt models. The second method uses a varying number of soft prompt tokens to encourage language models to learn different prompts. For the verbalizer, we replace the manual verbalizer with the prototypical verbalizer. Overall, we obtain a best average accuracy of 73.2% across different few-shot learning settings, a relative improvement of 2.52% even over the previous state-of-the-art SSL method that uses manual prompts and verbalizers.
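To make the idea of replacing a manual verbalizer with class prototypes more concrete, the following is a minimal sketch, not the method used in this paper. It assumes each example has already been encoded into a fixed-size embedding (for instance, the hidden state at the mask position), and it swaps in simple per-class mean embeddings for the learned prototypes of a prototypical verbalizer. The function names and the toy two-dimensional embeddings are hypothetical and chosen only for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_prototypes(embeddings, labels):
    """Assumption: a prototype is the mean embedding of a class's few-shot
    examples. A prototypical verbalizer would typically learn these vectors."""
    by_label = {}
    for embedding, label in zip(embeddings, labels):
        by_label.setdefault(label, []).append(embedding)
    return {label: mean_vector(vs) for label, vs in by_label.items()}


def predict(prototypes, embedding):
    """Return the label whose prototype is most similar to the embedding."""
    return max(prototypes, key=lambda label: cosine(prototypes[label], embedding))


# Toy few-shot data (hypothetical embeddings, two per class).
train_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
train_labels = ["positive", "positive", "negative", "negative"]
prototypes = build_prototypes(train_embeddings, train_labels)
```

With prototypes in place, no hand-picked label words are needed: `predict(prototypes, embedding)` assigns the class whose prototype lies closest to a new example's embedding.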