Accelerating Transformer Inference for Translation via Parallel Decoding - 专知论文

会员服务 ·

0

解码 · 推断 · MoDELS · 变换 · 机器翻译 ·

2023 年 5 月 17 日

Accelerating Transformer Inference for Translation via Parallel Decoding

翻译：通过并行解码加速机器翻译中的Transformer推理

Andrea Santilli,Silvio Severino,Emilian Postolache,Valentino Maiorca,Michele Mancusi,Riccardo Marin,Emanuele Rodolà

from arxiv, Accepted at ACL 2023 main conference

Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT). The community proposed specific network architectures and learning-based methods to solve this issue, which are expensive and require changes to the MT model, trading inference speed at the cost of the translation quality. In this paper, we propose to address the problem from the point of view of decoding algorithms, as a less explored but rather compelling direction. We propose to reframe the standard greedy autoregressive decoding of MT with a parallel formulation leveraging Jacobi and Gauss-Seidel fixed-point iteration methods for fast inference. This formulation allows to speed up existing models without training or modifications while retaining translation quality. We present three parallel decoding algorithms and test them on different languages and models showing how the parallelization introduces a speedup up to 38% w.r.t. the standard autoregressive decoding and nearly 2x when scaling the method on parallel resources. Finally, we introduce a decoding dependency graph visualizer (DDGviz) that let us see how the model has learned the conditional dependence between tokens and inspect the decoding procedure.

翻译：自回归解码限制了Transformer在机器翻译（MT）中的效率。学界提出了特定的网络架构和基于学习的方法来解决这一问题，但这些方法成本高昂且需要修改MT模型，以牺牲翻译质量为代价换取推理速度。本文从一个较少探索但颇具前景的方向——解码算法——出发来应对该问题。我们提出将标准贪婪自回归解码的MT过程重构为并行形式，利用雅可比和高斯-赛德尔不动点迭代方法实现快速推理。这种形式无需训练或模型修改即可加速现有模型，同时保持翻译质量。我们提出了三种并行解码算法，在不同语言和模型上进行了测试，结果表明，与标准自回归解码相比，并行化实现了最高38%的加速比，当在并行资源上扩展该方法时，加速比接近2倍。最后，我们引入了解码依赖关系图可视化工具（DDGviz），该工具可揭示模型如何学习词元间的条件依赖关系，并帮助检查解码过程。

0

相关内容

自然语言处理顶会NAACL2022最佳论文出炉！

自然语言处理顶会NAACL2022最佳论文出炉！

专知会员服务

43+阅读 · 2022年6月30日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

326+阅读 · 2020年11月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

SIGIR2019 接收论文列表

SIGIR2019 接收论文列表

专知

18+阅读 · 2019年4月20日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

【推荐】图像分类必读开创性论文汇总

【推荐】图像分类必读开创性论文汇总

机器学习研究会

14+阅读 · 2017年8月15日

基于氮杂环铱配合物的G4-DNA结构探针的合成及作用机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

基于目标导向MI-EEGas的上肢运动康复方法双向适应性研究

国家自然科学基金

0+阅读 · 2014年12月31日

混联双级NTP系统协同DPF和SCR同步降低柴油机PM和NOx排放的化学反应机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

高载药率的多重靶向抗肿瘤纳米药物载体研究

国家自然科学基金

0+阅读 · 2012年12月31日

非监督高光谱图像实时目标探测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

高压下Cd,In,Pb基半导体纳米晶结构、性质及荧光增强效应的研究

国家自然科学基金

0+阅读 · 2012年12月31日

氟化物和氟氧化物基白光LED用荧光粉的研制和设计

国家自然科学基金

0+阅读 · 2011年12月31日

有源光纤环形腔内相位调制产生RoF超连续光源研究

国家自然科学基金

0+阅读 · 2009年12月31日

SDF-1/CXCR4信号通路的干预及调节关节软骨退变的研究

国家自然科学基金

0+阅读 · 2008年12月31日

Trainable Transformer in Transformer

Arxiv

1+阅读 · 2023年7月3日

SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design

Arxiv

0+阅读 · 2023年7月3日

Zero-Shot Cross-Lingual Summarization via Large Language Models

Zero-Shot Cross-Lingual Summarization via Large Language Models

Arxiv

0+阅读 · 2023年7月3日

RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations

Arxiv

0+阅读 · 2023年7月2日

DETRs with Collaborative Hybrid Assignments Training

Arxiv

0+阅读 · 2023年7月2日

Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

Arxiv

0+阅读 · 2023年7月1日

Self-Supervised Query Reformulation for Code Search

Arxiv

0+阅读 · 2023年7月1日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

Arxiv

14+阅读 · 2021年2月16日

TinyBERT: Distilling BERT for Natural Language Understanding

TinyBERT: Distilling BERT for Natural Language Understanding

Arxiv

11+阅读 · 2019年9月23日

VIP会员

文章信息

相关主题

最新内容

从采集到决策：美军视角下的战术情报范式重构

从采集到决策：美军视角下的战术情报范式重构

专知会员服务

9+阅读 · 8月2日

乌克兰“德尔塔”系统揭示无人机、数据与领导力如何重塑现代安全格局

乌克兰“德尔塔”系统揭示无人机、数据与领导力如何重塑现代安全格局

专知会员服务

4+阅读 · 8月2日

大规模作战中的参谋流程：作为联合兵种作战组成部分的目标锁定

大规模作战中的参谋流程：作为联合兵种作战组成部分的目标锁定

专知会员服务

7+阅读 · 8月2日

《北约概念开发与实验（CD&E）手册：概念开发者工具箱》100页手册

《北约概念开发与实验（CD&E）手册：概念开发者工具箱》100页手册

专知会员服务

9+阅读 · 8月2日

《履带式无人地面战车技术发展现状》

《履带式无人地面战车技术发展现状》

专知会员服务

4+阅读 · 8月2日

《美国空军B-2“幽灵”隐身轰炸机系统工程案例研究》117页

《美国空军B-2“幽灵”隐身轰炸机系统工程案例研究》117页

专知会员服务

8+阅读 · 8月1日

隐身技术前沿综述：物理机理、工程实践与战略展望

隐身技术前沿综述：物理机理、工程实践与战略展望

专知会员服务

6+阅读 · 8月1日

《多变海洋环境下无人水面艇与自主水下机器人对接的最优路径规划》

《多变海洋环境下无人水面艇与自主水下机器人对接的最优路径规划》

专知会员服务

6+阅读 · 8月1日

《以机反机：基于无人机载麦克风的空中周界入侵检测》

《以机反机：基于无人机载麦克风的空中周界入侵检测》

专知会员服务

6+阅读 · 8月1日

《无人机脆弱性利用：网络空间力量的新域》

《无人机脆弱性利用：网络空间力量的新域》

专知会员服务

4+阅读 · 8月1日

美空军如何将人工智能从战场部署至后方机关

美空军如何将人工智能从战场部署至后方机关

专知会员服务

13+阅读 · 7月31日

《美战争部指令文件：网络空间效应与使能能力测试评估》

《美战争部指令文件：网络空间效应与使能能力测试评估》

专知会员服务

9+阅读 · 7月31日

《史诗怒火行动：多域前瞻评估》49页报告

《史诗怒火行动：多域前瞻评估》49页报告

专知会员服务

12+阅读 · 7月31日

《英国防部：未来空战系统数字化战略》33页

《英国防部：未来空战系统数字化战略》33页

专知会员服务

7+阅读 · 7月31日

《面向自主飞行网络的智能体人工智能架构》

《面向自主飞行网络的智能体人工智能架构》

专知会员服务

10+阅读 · 7月31日

相关VIP内容

自然语言处理顶会NAACL2022最佳论文出炉！

自然语言处理顶会NAACL2022最佳论文出炉！

专知会员服务

43+阅读 · 2022年6月30日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

326+阅读 · 2020年11月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

乌克兰“德尔塔”系统揭示无人机、数据与领导力如何重塑现代安全格局

《北约概念开发与实验（CD&E）手册：概念开发者工具箱》100页手册

从采集到决策：美军视角下的战术情报范式重构

大规模作战中的参谋流程：作为联合兵种作战组成部分的目标锁定

相关资讯

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

SIGIR2019 接收论文列表

SIGIR2019 接收论文列表

专知

18+阅读 · 2019年4月20日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

【推荐】图像分类必读开创性论文汇总

【推荐】图像分类必读开创性论文汇总

机器学习研究会

14+阅读 · 2017年8月15日

相关论文

Trainable Transformer in Transformer

Arxiv

1+阅读 · 2023年7月3日

SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design

Arxiv

0+阅读 · 2023年7月3日

Zero-Shot Cross-Lingual Summarization via Large Language Models

Zero-Shot Cross-Lingual Summarization via Large Language Models

Arxiv

0+阅读 · 2023年7月3日

RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations

Arxiv

0+阅读 · 2023年7月2日

DETRs with Collaborative Hybrid Assignments Training

Arxiv

0+阅读 · 2023年7月2日

Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

Arxiv

0+阅读 · 2023年7月1日

Self-Supervised Query Reformulation for Code Search

Arxiv

0+阅读 · 2023年7月1日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

Arxiv

14+阅读 · 2021年2月16日

TinyBERT: Distilling BERT for Natural Language Understanding

TinyBERT: Distilling BERT for Natural Language Understanding

Arxiv

11+阅读 · 2019年9月23日

相关基金

基于氮杂环铱配合物的G4-DNA结构探针的合成及作用机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

基于目标导向MI-EEGas的上肢运动康复方法双向适应性研究

国家自然科学基金

0+阅读 · 2014年12月31日

混联双级NTP系统协同DPF和SCR同步降低柴油机PM和NOx排放的化学反应机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

高载药率的多重靶向抗肿瘤纳米药物载体研究

国家自然科学基金

0+阅读 · 2012年12月31日

非监督高光谱图像实时目标探测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

高压下Cd,In,Pb基半导体纳米晶结构、性质及荧光增强效应的研究

国家自然科学基金

0+阅读 · 2012年12月31日

氟化物和氟氧化物基白光LED用荧光粉的研制和设计

国家自然科学基金

0+阅读 · 2011年12月31日

有源光纤环形腔内相位调制产生RoF超连续光源研究

国家自然科学基金

0+阅读 · 2009年12月31日

SDF-1/CXCR4信号通路的干预及调节关节软骨退变的研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员