The continual scaling of pre-trained language models (PLMs) imposes a substantial burden on model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation across different prompts, we explore instance-level prompts and their generalizability. By searching through the prompt space, we first validate the assumption that for every instance there is almost always a lottery prompt that induces the correct prediction from the PLM, and that such a prompt can be obtained at low cost thanks to the inherent ability of PLMs. We also find that some strong lottery prompts perform well over the whole training set and exhibit distinguishable linguistic features. Lastly, we attempt to generalize the searched strong lottery prompts to unseen data with a prompt ensembling method that requires no parameter tuning. Experiments on various types of NLP classification tasks demonstrate that the proposed method achieves results comparable to other gradient-free and optimization-free baselines.
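The three steps summarized in the abstract (checking whether each instance has a lottery prompt, ranking prompts by training-set accuracy, and ensembling the strongest ones on unseen data) can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt templates, the toy predictions, the helper names, and the use of simple majority voting as the ensembling rule are all assumptions made for illustration, and the PLM is replaced by precomputed prediction lists.

```python
from collections import Counter

# Hypothetical precomputed PLM predictions: prompt template -> one label per instance.
# In practice these would come from querying a frozen PLM; here they are toy values.
train_labels = ["pos", "neg", "pos", "neg"]
train_preds = {
    "It was [MASK].":   ["pos", "neg", "pos", "pos"],
    "Overall, [MASK].": ["pos", "pos", "pos", "neg"],
    "I felt [MASK].":   ["neg", "neg", "neg", "neg"],
    "Just [MASK]!":     ["neg", "pos", "neg", "pos"],
}
test_preds = {
    "It was [MASK].":   ["pos", "neg", "neg"],
    "Overall, [MASK].": ["pos", "pos", "neg"],
    "I felt [MASK].":   ["neg", "neg", "neg"],
    "Just [MASK]!":     ["neg", "pos", "pos"],
}


def lottery_coverage(preds, labels):
    """Fraction of instances for which at least one prompt yields the gold label."""
    hits = sum(
        any(p[i] == labels[i] for p in preds.values())
        for i in range(len(labels))
    )
    return hits / len(labels)


def prompt_accuracy(pred, labels):
    """Accuracy of a single prompt over a labelled set."""
    return sum(p == y for p, y in zip(pred, labels)) / len(labels)


def top_k_prompts(preds, labels, k):
    """Names of the k prompts with the highest training accuracy (ties keep input order)."""
    ranked = sorted(
        preds,
        key=lambda name: prompt_accuracy(preds[name], labels),
        reverse=True,
    )
    return ranked[:k]


def ensemble_predict(preds, prompt_names):
    """Majority vote over the selected prompts (an assumed ensembling rule)."""
    n = len(preds[prompt_names[0]])
    out = []
    for i in range(n):
        votes = Counter(preds[name][i] for name in prompt_names)
        out.append(votes.most_common(1)[0][0])
    return out


coverage = lottery_coverage(train_preds, train_labels)
strong = top_k_prompts(train_preds, train_labels, k=3)
ensemble = ensemble_predict(test_preds, strong)
print(coverage, strong, ensemble)
```

No parameters are updated anywhere in this sketch: selection relies only on forward predictions over the training set, which is the sense in which the approach is gradient-free and optimization-free.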