The Curse of Unrolling: Rate of Differentiating Through Optimization

May 19, 2023

Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa

Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates it through the computational path. This work provides a non-asymptotic convergence-rate analysis of this approach on quadratic objectives for gradient descent and the Chebyshev method. We show that to ensure convergence of the Jacobian, we can either 1) choose a large learning rate leading to a fast asymptotic convergence but accept that the algorithm may have an arbitrarily long burn-in phase or 2) choose a smaller learning rate leading to an immediate but slower convergence. We refer to this phenomenon as the curse of unrolling. Finally, we discuss open problems relative to this approach, such as deriving a practical update rule for the optimal unrolling strategy and making novel connections with the field of Sobolev orthogonal polynomials.
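The trade-off described in the abstract can be reproduced on a toy problem. The following is a minimal sketch, not the authors' code: the two-dimensional diagonal quadratic, the eigenvalues 1 and 10, the vector `b`, the function name `unrolled_jacobian_errors`, and the two step sizes `2/(mu+L)` and `1/L` are all assumptions chosen for illustration. It runs gradient descent, propagates the derivative of each iterate with respect to a scalar parameter `theta` through the update (unrolled differentiation), and records the distance to the exact Jacobian of the minimizer.

```python
import math


def unrolled_jacobian_errors(step, eigs, b, theta, n_steps):
    """Norm of (dx_t/dtheta - dx*/dtheta) for t = 0..n_steps under gradient descent.

    Toy objective (an assumption for illustration):
        f(x, theta) = sum_i 0.5 * (eigs[i] + theta) * x_i**2 - b[i] * x_i
    whose minimizer is x*_i = b[i] / (eigs[i] + theta).
    """
    curv = [lam + theta for lam in eigs]                 # Hessian eigenvalues
    jac_star = [-bi / c ** 2 for bi, c in zip(b, curv)]  # exact dx*/dtheta
    x = [0.0] * len(eigs)    # iterate x_0
    jac = [0.0] * len(eigs)  # dx_0/dtheta (x_0 does not depend on theta)
    errors = []
    for _ in range(n_steps + 1):
        errors.append(math.sqrt(sum((j - js) ** 2 for j, js in zip(jac, jac_star))))
        # Differentiate the update x <- x - step * (c * x - b) with respect to
        # theta, using the pre-update x (chain rule through one solver step).
        jac = [j - step * (c * j + xi) for j, c, xi in zip(jac, curv, x)]
        x = [xi - step * (c * xi - bi) for xi, c, bi in zip(x, curv, b)]
    return errors


mu, L = 1.0, 10.0       # smallest and largest Hessian eigenvalue (illustrative)
eigs = [mu, L]
b = [mu ** 2, L ** 2]   # chosen so both coordinates start with the same Jacobian error
theta = 0.0

err_large = unrolled_jacobian_errors(2.0 / (mu + L), eigs, b, theta, 100)
err_small = unrolled_jacobian_errors(1.0 / L, eigs, b, theta, 100)

for t in (0, 5, 20, 100):
    print(t, err_large[t], err_small[t])
```

In this toy run, the larger step size makes the Jacobian error rise above its initial value for the first few iterations before it decays quickly, while the smaller step size never increases the error but ends with a larger error after 100 steps. The iterates themselves converge without such a hump; the burn-in shows up only in the derivative.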
