Large Language Models (LLMs) have demonstrated remarkable performance in various tasks and gained significant attention. LLMs are also used for local sequence transduction tasks, such as grammatical error correction (GEC) and formality style transfer, where most tokens in the source text remain unchanged. However, generating all target tokens is inefficient for two reasons: an error in predicting one target token can derail the prediction of subsequent tokens, and the computational cost grows quadratically with the target sequence length. This paper proposes to predict a set of edit operations on the source text for local sequence transduction tasks. By representing each edit operation as a span of the source text together with the changed tokens, we can reduce the length of the target sequence and thus the computational cost of inference. We apply instruction tuning to LLMs on supervision data of edit operations. Experiments show that the proposed method achieves performance comparable to the baseline in four tasks (paraphrasing, formality style transfer, GEC, and text simplification), despite reducing the length of the target text by as little as 21%. Furthermore, we report that instruction tuning with the proposed method achieves state-of-the-art performance in the four tasks.
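To make the idea of an edit operation concrete, the following is a minimal sketch of turning a source/target pair into span-based edits and applying them back. It is an illustration under assumptions, not the paper's implementation: the abstract does not specify the serialization format, so the `(start, end, replacement_tokens)` tuple layout, the whitespace tokenization, and the function names `extract_edits` and `apply_edits` are all hypothetical. The alignment uses Python's standard `difflib.SequenceMatcher`, which is a stand-in for whatever alignment procedure the authors use.

```python
from difflib import SequenceMatcher


def extract_edits(source_tokens, target_tokens):
    """Return edits as (start, end, replacement_tokens) over source token indices.

    Hypothetical format: the span [start, end) of the source is replaced by
    replacement_tokens. An empty span is an insertion; an empty replacement
    is a deletion.
    """
    matcher = SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # unchanged spans are not emitted at all
            edits.append((i1, i2, target_tokens[j1:j2]))
    return edits


def apply_edits(source_tokens, edits):
    """Rebuild the target by applying edits right to left,
    so that earlier source indices remain valid."""
    result = list(source_tokens)
    for start, end, replacement in sorted(edits, key=lambda e: e[0], reverse=True):
        result[start:end] = replacement
    return result


# Assumed toy GEC example with whitespace tokenization.
source = "He go to school yesterday and buy a books".split()
target = "He went to school yesterday and bought a book".split()

edits = extract_edits(source, target)
print(edits)
print(" ".join(apply_edits(source, edits)))
```

In this toy pair, only three single-token spans change, so the edit list carries three replacement tokens instead of the nine tokens of the full target. How much shorter the model's actual output becomes depends on how the spans are serialized as text, which this sketch does not attempt to reproduce.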