Large language models (LLMs) have recently become extremely popular due to their strong performance on a wide variety of tasks, such as text generation and rewriting, but their size and computational cost make them difficult to access, deploy, and secure in many settings. This paper investigates whether small, decoder-only language models (SLMs) can provide an efficient alternative for grammar correction and text simplification. The experiments evaluate small language models out of the box, after fine-tuning, and when run sequentially on the JFLEG and ASSET datasets, using established metrics. The results show that while SLMs can learn certain behaviors well, their performance remains below strong baselines and current LLMs; they also struggle to preserve meaning and are prone to hallucination. These findings suggest that, despite their efficiency advantages, current SLMs are not yet competitive with modern LLMs for rewriting tasks, and further advances in training are required to close the performance gap.