Multiple Choice Question Answering (MCQA) is an important problem with numerous real-world applications in domains such as medicine, law, and education. The high cost of building MCQA datasets makes few-shot learning pivotal in this domain. While Large Language Models (LLMs) enable few-shot learning, their direct application in real-world scenarios is often hindered by their high computational cost. To address this challenge, we propose a simple yet effective approach that uses LLMs for data generation and scoring. Our approach uses LLMs to create MCQA data containing questions and answer choices, and to assign probability scores to the generated choices. We then use the generated data and LLM-assigned scores to finetune a smaller, more efficient encoder-only model, DeBERTa-v3-base, with a distillation loss. Extensive experiments on the Massive Multitask Language Understanding (MMLU) benchmark demonstrate that our method improves accuracy from 28.9% to 39.3%, a gain of over 10 percentage points compared to a baseline finetuned directly on 5-shot examples. This shows the effectiveness of LLM-driven data generation and knowledge distillation for few-shot MCQA.