Text editing, or revision, is an essential part of the human writing process. Understanding how well LLMs can make high-quality revisions and collaborate with human writers is a critical step toward building effective writing assistants. Building on the prior success of LLMs and instruction tuning, we leverage instruction-tuned LLMs for text revision to improve both the quality of user-generated text and the efficiency of the revision process. We introduce CoEdIT, a state-of-the-art text editing model for writing assistance. CoEdIT takes instructions from the user specifying the attributes of the desired text, such as "Make the sentence simpler" or "Write it in a more neutral style," and outputs the edited text. The model is a large language model fine-tuned on a diverse collection of task-specific instructions for text editing (82K instructions in total). Our model (1) achieves state-of-the-art performance on various text editing benchmarks, (2) is competitive with the largest publicly available instruction-tuned LLMs while being $\sim$60x smaller, (3) generalizes to unseen edit instructions, and (4) exhibits compositional comprehension abilities, generalizing to instructions that contain different combinations of edit actions. Through extensive qualitative and quantitative analysis, we show that writers prefer the edits suggested by CoEdIT over those of other state-of-the-art text editing models. Our code and dataset are publicly available.