Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allows them to perform any text classification task without fine-tuning (zeroshot classification) or to learn new tasks from only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows principles similar to the instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier, which is trained on 33 datasets with 389 diverse classes. Parts of the code we share have been used to train our older zeroshot classifiers, which have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4%.
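To illustrate the idea of NLI as a universal classification task, the following is a minimal sketch, not the paper's implementation. It assumes that each candidate class is verbalized as a hypothesis through a template, that the input text serves as the premise, and that the class whose hypothesis is most strongly entailed is chosen. The function names, the template wording, and `toy_entailment_score` are hypothetical; the toy scorer is a word-overlap placeholder that stands in for a real NLI model's entailment probability.

```python
from typing import Callable, Dict, List, Tuple


def build_nli_pairs(
    text: str,
    labels: List[str],
    template: str = "This text is about {}.",
) -> List[Tuple[str, str]]:
    """Turn one classification input into (premise, hypothesis) pairs, one per class."""
    return [(text, template.format(label)) for label in labels]


def classify_via_nli(
    text: str,
    labels: List[str],
    entailment_score: Callable[[str, str], float],
    template: str = "This text is about {}.",
) -> Tuple[str, Dict[str, float]]:
    """Pick the label whose hypothesis receives the highest entailment score."""
    pairs = build_nli_pairs(text, labels, template)
    scores = {
        label: entailment_score(premise, hypothesis)
        for label, (premise, hypothesis) in zip(labels, pairs)
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Placeholder scorer (assumption): share of hypothesis words found in the premise.

    A real setup would return an NLI model's probability for the entailment class.
    """
    premise_words = set(premise.lower().replace(".", "").split())
    hypothesis_words = set(hypothesis.lower().replace(".", "").split())
    return len(premise_words & hypothesis_words) / len(hypothesis_words)


if __name__ == "__main__":
    label, scores = classify_via_nli(
        "The government passed a new law on politics and taxes.",
        ["politics", "sports"],
        toy_entailment_score,
    )
    print(label, scores)
```

Because the classes enter only through the hypothesis text, the same scorer can be reused for any label set without retraining, which is the sense in which the task is universal.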