Large language models (LLMs) with billions of parameters, pretrained on massive amounts of data, now achieve performance near or above the state of the art on a variety of downstream natural language processing tasks. Neural machine translation (NMT) is one such task to which LLMs have been applied with great success. However, little research has focused on applying LLMs to the more difficult subset of NMT called simultaneous translation (SimulMT), where translation begins before the entire source context is available to the model. In this paper, we address key challenges facing LLMs fine-tuned for SimulMT, validate classical SimulMT concepts and practices in the context of LLMs, explore adapting LLMs that are fine-tuned for NMT to the task of SimulMT, and introduce Simul-LLM, the first open-source fine-tuning and evaluation pipeline development framework for LLMs focused on SimulMT.