Simultaneous Machine Translation with Large Language Models

Large language models (LLMs) have demonstrated the ability to solve a variety of natural language processing tasks through dialogue-based interaction. For instance, research indicates that LLMs can achieve competitive performance on offline machine translation for high-resource languages. However, applying LLMs to simultaneous machine translation (SimulMT) poses many challenges, including a training-inference mismatch arising from different decoding patterns. In this paper, we explore the feasibility of utilizing LLMs for SimulMT. Building upon conventional approaches, we introduce a simple yet effective mixture policy that enables LLMs to engage in SimulMT without requiring additional training. Furthermore, after supervised fine-tuning (SFT) on a mixture of full and prefix sentences, the model exhibits significant performance improvements. Our experiments, conducted with Llama2-7B-chat on nine language pairs from the MuST-C dataset, demonstrate that LLMs can achieve translation quality and latency comparable to dedicated SimulMT models.
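To make the idea of fine-tuning on "a mixture of full and prefix sentences" more concrete, the following is a minimal sketch of how such a training mixture might be assembled. The abstract does not describe the actual construction, so everything here is an assumption for illustration: the function names `make_prefix_pair` and `build_sft_mixture`, the whitespace tokenization, the proportional truncation of source and target, and the sampling ranges are all hypothetical and may differ from what the authors did.

```python
import random


def make_prefix_pair(source, target, ratio):
    """Truncate both sides of a sentence pair to about the same proportion.

    Assumption: proportional truncation is a crude stand-in for a real
    prefix alignment; it ignores word-order differences between languages.
    """
    src_tokens = source.split()
    tgt_tokens = target.split()
    # Keep at least one token on each side so the pair is never empty.
    src_keep = max(1, int(len(src_tokens) * ratio))
    tgt_keep = max(1, int(len(tgt_tokens) * ratio))
    return " ".join(src_tokens[:src_keep]), " ".join(tgt_tokens[:tgt_keep])


def build_sft_mixture(pairs, prefix_fraction=0.5, seed=0):
    """Return full pairs plus, for a random subset, one truncated prefix pair.

    prefix_fraction is a hypothetical knob: the probability that a given
    full pair also contributes a prefix pair to the mixture.
    """
    rng = random.Random(seed)
    mixture = []
    for source, target in pairs:
        mixture.append((source, target))  # always keep the full sentence
        if rng.random() < prefix_fraction:
            ratio = rng.uniform(0.2, 0.8)  # assumed range of prefix lengths
            mixture.append(make_prefix_pair(source, target, ratio))
    return mixture
```

Calling `build_sft_mixture(pairs, prefix_fraction=1.0)` would yield one full and one prefix example per sentence pair. Exposing the model to truncated inputs during training is one plausible way to reduce the training-inference mismatch mentioned above, since at inference time a simultaneous system has to translate from incomplete source text.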

