Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions

May 24, 2023

Jiahuan Li, Hao Zhou, Shujian Huang, Shanbo Chen, Jiajun Chen

Large-scale pretrained language models (LLMs), such as ChatGPT and GPT-4, have shown strong multilingual translation abilities without being explicitly trained on parallel corpora. How LLMs acquire the ability to carry out translation instructions for different languages remains an open question. In this paper, we present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation following given instructions. First, we show that multilingual LLMs have stronger translation abilities than previously demonstrated. For a given language pair, performance depends on both the language families involved and the amount of data for those languages used in the pretraining phase. Second, we find that an LLM's ability to carry out translation instructions relies on its understanding of the translation instruction and on the alignment among different languages. With proper enhancement, LLMs can perform the translation task well even for language pairs unseen during the instruction tuning phase.
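
As a rough illustration of what "finetuning with translation instructions" could look like at the data level, the sketch below turns parallel sentence pairs into instruction-style prompt/completion examples. The template wording, the `ParallelPair` type, and the `build_example` and `build_dataset` helpers are all hypothetical; the abstract does not specify the instruction format or data pipeline the authors used.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical instruction template; the abstract does not give the actual wording.
TEMPLATE = "Translate the following sentence from {src} to {tgt}.\n{src}: {text}\n{tgt}:"


@dataclass(frozen=True)
class ParallelPair:
    """One aligned sentence pair (illustrative type, not from the paper)."""
    src_lang: str
    tgt_lang: str
    src_text: str
    tgt_text: str


def build_example(pair: ParallelPair) -> Dict[str, str]:
    """Format one pair as a prompt (instruction + source) and a completion (target)."""
    prompt = TEMPLATE.format(src=pair.src_lang, tgt=pair.tgt_lang, text=pair.src_text)
    # Leading space so the completion continues naturally after the trailing colon.
    return {"prompt": prompt, "completion": " " + pair.tgt_text}


def build_dataset(pairs: List[ParallelPair], both_directions: bool = True) -> List[Dict[str, str]]:
    """Build instruction examples, optionally adding the reverse direction of each pair."""
    examples = []
    for pair in pairs:
        examples.append(build_example(pair))
        if both_directions:
            reverse = ParallelPair(pair.tgt_lang, pair.src_lang, pair.tgt_text, pair.src_text)
            examples.append(build_example(reverse))
    return examples


pairs = [
    ParallelPair("English", "German", "Good morning.", "Guten Morgen."),
    ParallelPair("English", "French", "Thank you.", "Merci."),
]
dataset = build_dataset(pairs)
```

Each resulting example could then be fed to a standard causal language model finetuning loop, typically with the loss computed on the completion tokens; that choice is also an assumption here rather than something the abstract states.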
