Large language models (LLMs) are a promising avenue for machine translation (MT). However, current LLM-based MT systems are brittle: their effectiveness depends heavily on the choice of few-shot examples, and they often require extra post-processing because of overgeneration. Alternatives such as finetuning on translation instructions are computationally expensive and may weaken in-context learning capabilities through overspecialization. In this paper, we take a closer look at this problem. We start by showing that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of trainable parameters by a factor of 50. This method also outperforms few-shot prompting and eliminates the need for post-processing or in-context examples. However, we show that finetuning generally degrades few-shot performance, hindering adaptation capabilities. Finally, to obtain the best of both worlds, we propose a simple approach that incorporates few-shot examples during finetuning. Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.
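To make the idea of incorporating few-shot examples during finetuning concrete, the following is a minimal sketch of how training instances could be built so that some carry in-context demonstrations and others carry none. The abstract does not specify the prompt template, how many demonstrations are used, or how they are selected, so the instruction wording, the uniform sampling of the shot count, and the names `format_pair`, `build_training_example`, and `max_shots` are all illustrative assumptions rather than the paper's actual recipe.

```python
import random


def format_pair(src, tgt, src_lang, tgt_lang):
    """Render one demonstration: a source sentence with its translation."""
    return f"{src_lang}: {src}\n{tgt_lang}: {tgt}"


def build_training_example(pair, pool, src_lang, tgt_lang, max_shots=5, rng=None):
    """Build one finetuning instance with 0..max_shots in-context demonstrations.

    Assumption: the number of demonstrations is drawn uniformly at random, so
    the model sees both zero-shot and few-shot formats during finetuning.
    The template below is hypothetical, not taken from the paper.
    """
    rng = rng or random.Random()
    # Never use the target pair itself as a demonstration.
    candidates = [p for p in pool if p != pair]
    k = rng.randint(0, min(max_shots, len(candidates)))
    shots = rng.sample(candidates, k)

    header = f"Translate the source text from {src_lang} to {tgt_lang}."
    blocks = [header]
    blocks += [format_pair(s, t, src_lang, tgt_lang) for s, t in shots]

    src, tgt = pair
    # The final block leaves the target side open for the model to complete.
    blocks.append(f"{src_lang}: {src}\n{tgt_lang}:")

    return {
        "prompt": "\n\n".join(blocks),
        "completion": " " + tgt,
        "num_shots": k,
    }


if __name__ == "__main__":
    pool = [
        ("Hello.", "Hallo."),
        ("Thank you.", "Danke."),
        ("Good night.", "Gute Nacht."),
    ]
    pair = ("Where is the station?", "Wo ist der Bahnhof?")
    example = build_training_example(
        pair, pool, "English", "German", max_shots=2, rng=random.Random(0)
    )
    print(example["prompt"] + example["completion"])
```

Instances produced this way could then be passed to any finetuning setup, including a LoRA adapter. Keeping the prompt and the completion in separate fields leaves open whether the loss is computed on the completion only, which is another choice the abstract does not settle.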