Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning. Nonetheless, they have seen limited use in simultaneous machine translation (SiMT), which is currently dominated by encoder-decoder transformers. This study demonstrates that, after fine-tuning on a small dataset of causally aligned source and target sentence pairs, a pre-trained open-source LLM can control input segmentation directly by generating a special "wait" token. This obviates the need for a separate policy and enables the LLM to perform English-German and English-Russian SiMT with BLEU scores comparable to those of specific state-of-the-art baselines. We also evaluated closed-source models such as GPT-4, which showed encouraging results on the SiMT task without prior training (zero-shot), indicating a promising avenue for enhancing future SiMT systems.
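The idea of letting the model's own "wait" token drive input segmentation can be illustrated with a minimal sketch of the read/write loop. This is not the authors' implementation: the names `WAIT`, `EOS`, `simultaneous_decode`, and `toy_next_token` are hypothetical, the toy lexicon stands in for a fine-tuned LLM, and starting with one source word already read is an assumption made only for illustration.

```python
from typing import Callable, List

WAIT = "<WAIT>"  # hypothetical spelling of the special "wait" token
EOS = "<EOS>"    # hypothetical end-of-sequence marker


def simultaneous_decode(
    source_words: List[str],
    next_token: Callable[[List[str], List[str]], str],
    max_steps: int = 200,
) -> List[str]:
    """Interleave reading and writing, driven only by the model's output."""
    # Assumption: the first source word is available before decoding starts.
    read = 1 if source_words else 0
    target: List[str] = []
    for _ in range(max_steps):
        token = next_token(source_words[:read], target)
        if token == EOS:
            break
        if token == WAIT:
            # The model asks for more context: reveal one more source word.
            # If the source is exhausted, the request is simply ignored.
            if read < len(source_words):
                read += 1
            continue
        target.append(token)  # any other token is a translated word
    return target


# Toy stand-in for a fine-tuned LLM: word-by-word lookup in a tiny lexicon.
LEXICON = {"we": "wir", "see": "sehen", "it": "es", ".": "."}


def toy_next_token(prefix: List[str], target: List[str]) -> str:
    if len(target) < len(prefix):
        word = prefix[len(target)]      # next source word not yet translated
        return LEXICON.get(word, word)
    if prefix and prefix[-1] == ".":
        return EOS                      # sentence-final period was translated
    return WAIT                         # nothing left to translate: ask for more


translation = simultaneous_decode(["we", "see", "it", "."], toy_next_token)
print(" ".join(translation))
```

In this sketch no separate policy module decides when to read: the decision is whatever token the model emits next, which is the property the abstract highlights. A real system would replace `toy_next_token` with a call to the fine-tuned model.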