Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, their large parameter counts and high computational requirements pose challenges for practical deployment. Recent research has shown that specific capabilities of LLMs, such as numerical reasoning, can be transferred to smaller models through distillation. Some studies have explored the potential of leveraging LLMs for table-based reasoning. Nevertheless, prior to our work, there has been no investigation into specialising table reasoning skills in smaller models tailored for table-to-text generation tasks. In this paper, we propose a novel table-based reasoning distillation approach that distils LLMs into smaller models tailored for table-based reasoning tasks. Experimental results show that a 0.22 billion parameter model (Flan-T5-base) fine-tuned on distilled data not only achieves a significant improvement over traditionally fine-tuned baselines but also surpasses specific LLMs such as gpt-3.5-turbo on the scientific table-to-text generation dataset (SciGen). The code and data are released at https://github.com/Bernard-Yang/TableDistill.