Transformer-Based Learned Optimization

Optimizer · Learning · BFGS · Networking · Neural Networks

May 24, 2023

Erik Gärtner, Luke Metz, Mykhaylo Andriluka, C. Daniel Freeman, Cristian Sminchisescu

From arXiv. Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 in Vancouver, Canada.

We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network. The parameters of the optimizer are then learned by training on a set of optimization tasks with the objective of performing minimization efficiently. Our innovation is a new neural network architecture for the learned optimizer, Optimus, inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates, but we use a Transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization-based approaches, our formulation allows for conditioning across the dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms, as well as on the real-world task of physics-based visual reconstruction of articulated 3D human motion.
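For readers unfamiliar with the classic algorithm the abstract refers to, the following is a minimal sketch of textbook BFGS in Python with NumPy. It is not the Optimus architecture and makes no claim about the authors' implementation: it only shows how a hand-designed BFGS step accumulates a preconditioner from rank-one terms built from the parameter change `s` and the gradient change `y`, which is the part the abstract says a Transformer predicts instead. The function names `bfgs_inverse_update` and `bfgs_minimize`, the backtracking line search, and all constants are illustrative assumptions.

```python
import numpy as np


def bfgs_inverse_update(H, s, y):
    """Textbook BFGS update of the inverse-Hessian estimate H.

    s = x_new - x_old (parameter change), y = grad_new - grad_old.
    Expanding the product below gives H plus a few rank-one terms
    built from s and H @ y.
    """
    sy = float(s @ y)
    if sy <= 1e-12:  # curvature condition fails: keep the old estimate
        return H
    rho = 1.0 / sy
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)


def bfgs_minimize(f, grad, x0, steps=100, tol=1e-8):
    """Minimize f with BFGS and a simple backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))  # initial preconditioner: identity
    g = grad(x)
    for _ in range(steps):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g  # preconditioned descent direction
        t = 1.0
        # Halve the step length until sufficient decrease holds.
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p) and t > 1e-12:
            t *= 0.5
        x_new = x + t * p
        g_new = grad(x_new)
        H = bfgs_inverse_update(H, x_new - x, g_new - g)
        x, g = x_new, g_new
    return x


if __name__ == "__main__":
    # Toy convex quadratic f(x) = 0.5 x^T A x - b^T x (an assumed test problem).
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    x_star = bfgs_minimize(lambda v: 0.5 * v @ A @ v - b @ v,
                           lambda v: A @ v - b,
                           np.zeros(2))
    print(x_star, np.linalg.solve(A, b))
```

In this hand-designed version the update formula, the step length rule, and the direction are all fixed in advance. According to the abstract, the learned optimizer instead has a network predict the rank-one updates together with the step length and direction.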

