Gradient Correction beyond Gradient Descent

May 26, 2023

Zefan Li, Bingbing Ni, Teng Li, WenJun Zhang, Wen Gao

Comment from arXiv: There are errors in the description of the GC-W module and the GC-ODE module in Section 3.2 and Section 3.3, which may mislead readers, e.g., (1) the structure of the GC-W module is not described correctly; (2) the GC-ODE module is not described clearly. Therefore, we want to withdraw this paper for a thorough correction.

The great success that neural networks have achieved is inseparable from the application of gradient-descent (GD) algorithms. Based on GD, many variant algorithms have emerged to improve the GD optimization process. The gradient for back-propagation is arguably the most crucial aspect of training a neural network. The quality of the calculated gradient can be affected by multiple factors, e.g., noisy data, calculation error, and algorithm limitations. To reveal gradient information beyond gradient descent, we introduce a framework (GCGD) to perform gradient correction. GCGD consists of two plug-in modules: 1) inspired by the idea of gradient prediction, we propose a GC-W module for weight gradient correction; 2) based on Neural ODE, we propose a GC-ODE module for hidden-state gradient correction. Experimental results show that our gradient correction framework can effectively improve gradient quality, reducing training epochs by about 20% while also improving network performance.
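
As a rough illustration of what a plug-in gradient correction step could look like in a training loop, the sketch below runs gradient descent on a toy quadratic with noisy gradients and passes each gradient through an optional correction hook before the update. The `EmaCorrector` class, the `train` function, and the toy objective are assumptions introduced here for illustration only. The hook is a plain exponential moving average; it is not the GC-W or GC-ODE module, whose details the abstract does not give.

```python
import random


def noisy_grad(w, rng, noise=0.5):
    """Gradient of f(w) = 0.5 * w**2 plus Gaussian noise (a stand-in for noisy data)."""
    return w + rng.gauss(0.0, noise)


class EmaCorrector:
    """Hypothetical correction hook: exponential moving average of gradients.

    NOT the paper's GC-W or GC-ODE module; it only marks where a plug-in
    correction step would sit in the update loop.
    """

    def __init__(self, beta=0.9):
        self.beta = beta
        self.avg = None  # running average, initialised on the first call

    def __call__(self, grad):
        if self.avg is None:
            self.avg = grad
        else:
            self.avg = self.beta * self.avg + (1.0 - self.beta) * grad
        return self.avg


def train(steps=200, lr=0.1, corrector=None, seed=0):
    """Plain gradient descent with an optional gradient-correction hook."""
    rng = random.Random(seed)
    w = 5.0
    for _ in range(steps):
        g = noisy_grad(w, rng)
        if corrector is not None:
            g = corrector(g)  # plug-in point: correct the raw gradient
        w -= lr * g
    return w


if __name__ == "__main__":
    print("plain GD:      ", train())
    print("with EMA hook: ", train(corrector=EmaCorrector()))
```

Keeping the correction as a callable that maps a raw gradient to a corrected one means the update rule itself stays unchanged, which is one plausible reading of "plug-in" in the abstract.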
