Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

Translation-tailored large language models (LLMs) exhibit remarkable translation capabilities, even competing with commercial translation systems trained with supervision. However, off-target translation, where the output is written in a language other than the requested target, remains an unsolved problem, especially for low-resource languages, and it hinders the development of accurate LLM-based translation models. To mitigate off-target translation and improve LLM translation performance, recent works have either designed advanced prompting strategies that highlight the role of the translation instruction or exploited the in-context learning ability of LLMs by feeding few-shot demonstrations. These methods, however, do not fundamentally improve an LLM's ability to follow translation instructions, especially the language direction information. In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability of LLMs, in particular with respect to the translation direction. In the first stage, we tune the LLM with a maximum likelihood estimation (MLE) loss on the translation dataset to elicit basic translation capabilities. In the second stage, we construct instruction-conflicting samples by randomly replacing the translation direction within the instruction with a wrong one, and then introduce an extra unlikelihood loss to learn from those samples. Experiments on the IWSLT and WMT benchmarks with the LLaMA model, spanning 16 zero-shot directions, show that, compared to the competitive baseline of a translation-finetuned LLaMA, our method effectively reduces the off-target translation ratio (by 53.3% on average), thus improving translation quality by +5.7 SacreBLEU and +16.4 BLEURT on average. Analysis shows that our method preserves the model's general task performance on AlpacaEval. Code and models will be released at https://github.com/alphadl/LanguageAware_Tuning.
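The second-stage objective described above can be illustrated with a minimal Python sketch. This is not the released implementation: the instruction template, the language list, the helper names, and the mixing weight `alpha` are all assumptions made for illustration, and the losses operate on plain lists of per-token probabilities that the model assigns to the reference translation. The sketch uses the standard unlikelihood form, -log(1 - p), to push down the probability of the reference tokens when the instruction names the wrong target language.

```python
import math
import random

# Hypothetical language pool; the paper's actual set of languages is not given here.
LANGS = ["English", "German", "Chinese", "Romanian"]


def make_conflicting_instruction(src, tgt, rng, langs=LANGS):
    """Build an instruction whose target language is deliberately wrong.

    Assumption: the wrong language is drawn uniformly from the pool,
    excluding both the true source and the true target.
    """
    wrong = rng.choice([lang for lang in langs if lang not in (src, tgt)])
    return f"Translate the following sentences from {src} to {wrong}."


def mle_loss(token_probs):
    """Mean negative log-likelihood of the reference tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def unlikelihood_loss(token_probs, eps=1e-8):
    """Mean -log(1 - p): large when the model still assigns high
    probability to the reference tokens under a conflicting instruction."""
    return -sum(math.log(max(1.0 - p, eps)) for p in token_probs) / len(token_probs)


def stage2_loss(probs_correct, probs_conflict, alpha=0.5):
    """MLE on the correctly instructed sample plus a weighted
    unlikelihood term on the instruction-conflicting sample.
    `alpha` is a hypothetical mixing weight."""
    return mle_loss(probs_correct) + alpha * unlikelihood_loss(probs_conflict)


if __name__ == "__main__":
    rng = random.Random(0)
    print(make_conflicting_instruction("English", "German", rng))
    # Reference-token probabilities under the correct vs. the conflicting instruction.
    print(stage2_loss([0.8, 0.7, 0.9], [0.6, 0.5, 0.7]))
```

In a real training loop, the probability lists would come from the model's softmax outputs at the reference-token positions, and the first stage would use `mle_loss` alone.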

