Split and Rephrase with Large Language Models

The Split and Rephrase (SPRP) task, which consists in splitting complex sentences into a sequence of shorter grammatical sentences, while preserving the original meaning, can facilitate the processing of complex texts for humans and machines alike. It is also a valuable testbed to evaluate natural language processing models, as it requires modelling complex grammatical aspects. In this work, we evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics, although still lagging in terms of splitting compliance. Results from two human evaluations further support the conclusions drawn from automated metric results. We provide a comprehensive study that includes prompting variants, domain shift, fine-tuned pretrained language models of varying parameter size and training data volumes, contrasted with both zero-shot and few-shot approaches on instruction-tuned language models. Although the latter were markedly outperformed by fine-tuned models, they may constitute a reasonable off-the-shelf alternative. Our results provide a fine-grained analysis of the potential and limitations of large language models for SPRP, with significant improvements achievable using relatively small amounts of training data and model parameters overall, and remaining limitations for all models on the task.

翻译：拆分与重述（SPRP）任务旨在将复杂句子拆分为一系列语法正确的短句，同时保持原意，该任务能够促进人类和机器对复杂文本的处理。它同样是评估自然语言处理模型的重要测试平台，因其需要对复杂的语法层面进行建模。本研究评估了大型语言模型在该任务上的表现，结果表明虽然模型在拆分合规性方面仍有不足，但在核心指标上能较现有最佳成果实现显著提升。两项人工评估结果进一步佐证了自动化指标得出的结论。我们通过系统研究涵盖了提示变体、领域偏移、不同参数量与训练数据量的微调预训练语言模型，并与指令调优语言模型的零样本及少样本方法进行对比。尽管后者的表现显著逊于微调模型，它们仍可作为合理的即用型替代方案。本研究结果对大型语言模型在SPRP任务中的潜力与局限进行了细粒度分析：使用相对较少的训练数据和模型参数即可实现显著改进，但所有模型在该任务上仍存在固有局限。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

【WSDM2020】超越统计关系：将知识关系整合到多标签音乐风格分类的风格关联中（附pdf）

专知会员服务

18+阅读 · 2019年11月23日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日