Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allows them to perform any text classification task without fine-tuning (zeroshot classification) or to learn new tasks from only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows principles similar to the instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier, which is trained on 33 datasets with 389 diverse classes. Parts of the code we share have been used to train our older zeroshot classifiers, which have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4%.
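The idea of using NLI as a universal classification task can be illustrated with a minimal sketch: each candidate class is verbalized as a hypothesis, the input text serves as the premise, and the class whose hypothesis is most strongly entailed is predicted. The template wording, the function names, and the word-overlap scorer below are assumptions made for illustration only; in practice the scorer would be replaced by the entailment probability of a trained NLI model.

```python
from typing import Callable, Dict, List

# Hypothetical hypothesis template; the exact wording is an assumption.
HYPOTHESIS_TEMPLATE = "This text is about {}."


def build_hypotheses(labels: List[str],
                     template: str = HYPOTHESIS_TEMPLATE) -> Dict[str, str]:
    """Verbalize each class label as an NLI hypothesis."""
    return {label: template.format(label) for label in labels}


def classify(text: str,
             labels: List[str],
             entailment_score: Callable[[str, str], float],
             template: str = HYPOTHESIS_TEMPLATE) -> str:
    """Return the label whose hypothesis gets the highest entailment score.

    `entailment_score(premise, hypothesis)` stands in for an NLI model.
    """
    hypotheses = build_hypotheses(labels, template)
    scores = {label: entailment_score(text, hypothesis)
              for label, hypothesis in hypotheses.items()}
    return max(scores, key=scores.get)


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model (NOT a real entailment model):
    the fraction of hypothesis words that also occur in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().rstrip(".").split()
    overlap = sum(1 for word in hypothesis_words if word in premise_words)
    return overlap / len(hypothesis_words)


if __name__ == "__main__":
    text = "The government passed a new law on economy and trade"
    print(classify(text, ["economy", "sports", "music"], toy_entailment_score))
```

Because the classes enter only through the hypothesis text, new classes can be added at inference time without retraining, which is what makes the task format universal.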