Existing approaches to few-shot learning in NLP rely on large language models and on fine-tuning them to generalise to out-of-distribution data. In this work, we propose a simple yet powerful approach to "extreme" few-shot learning, in which models are exposed to as few as 4 examples per class. The approach is based on soft-label prototypes that collectively capture the distribution of different classes across the input domain space. Inspired by previous work (Sucholutsky et al., 2021) on univariate or simple multivariate (synthetic) data, we propose a novel approach that is effective on large, high-dimensional, real-world datasets. We learn soft-label prototypes within a neural framework (DeepSLP) and demonstrate experimentally that it achieves superior performance on 31 of 48 tested tasks and few-shot settings, while closely matching the performance of strong baselines on the rest. We focus on learning previously unseen NLP tasks from very few examples (4, 8 or 16) per label and present an in-depth analysis of the effectiveness of our approach.