Large language models have demonstrated remarkable few-shot performance, but that performance can be sensitive to which few-shot instances are selected. We propose PATRON, a new method that uses prompt-based uncertainty estimation to select data for fine-tuning pre-trained language models under cold-start scenarios, i.e., when no initial labeled data are available. In PATRON, we design (1) a prompt-based uncertainty propagation approach to estimate the importance of data points and (2) a partition-then-rewrite (PTR) strategy to promote sample diversity when querying for annotations. Experiments on six text classification datasets show that PATRON outperforms the strongest cold-start data selection baselines by up to 6.9%. Moreover, with only 128 labels, PATRON achieves 91.0% and 92.1% of the fully supervised performance with vanilla fine-tuning and prompt-based learning, respectively. Our implementation of PATRON is available at \url{https://github.com/yueyu1030/Patron}.
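
To make the two components concrete, the following is a minimal illustrative sketch, not the authors' implementation (see the repository above for that). It assumes that a prompt has already produced class probabilities for each unlabeled example, that sentence embeddings are available, and that partition ids are given. The entropy score, the Euclidean nearest-neighbor averaging, the mixing weight `alpha`, and all function names are assumptions made for illustration, and the rewrite step of PTR is omitted.

```python
import numpy as np


def entropy(probs):
    """Shannon entropy of each row of class probabilities (assumed uncertainty score)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def propagate_uncertainty(uncertainty, embeddings, k=1, alpha=0.7):
    """Mix each point's uncertainty with the mean uncertainty of its k nearest neighbours.

    The neighbourhood definition and the weight alpha are illustrative assumptions.
    """
    dists = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    return alpha * uncertainty + (1.0 - alpha) * uncertainty[neighbours].mean(axis=1)


def select_one_per_partition(scores, partition_ids):
    """Pick the highest-scoring point in each partition to keep the queries diverse."""
    chosen = []
    for pid in np.unique(partition_ids):
        members = np.flatnonzero(partition_ids == pid)
        chosen.append(int(members[np.argmax(scores[members])]))
    return chosen


# Toy data: four unlabeled examples, two classes, two partitions (all hypothetical).
probs = np.array([[0.5, 0.5], [0.9, 0.1], [0.6, 0.4], [0.99, 0.01]])
embeddings = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
partition_ids = np.array([0, 0, 1, 1])

scores = propagate_uncertainty(entropy(probs), embeddings, k=1)
selected = select_one_per_partition(scores, partition_ids)
print(selected)
```

Selecting one example per partition is the simplest way to enforce diversity in this sketch. A real annotation budget such as the 128 labels mentioned above would need correspondingly more partitions or more picks per partition.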