Leveraging the Triple Exponential Moving Average for Fast-Adaptive Moment Estimation - 专知论文

会员服务 ·

0

优化器 · 移动平均 · 矩 · 估计/估计量 · Networking ·

2023 年 6 月 2 日

Leveraging the Triple Exponential Moving Average for Fast-Adaptive Moment Estimation

翻译：利用三重指数移动平均实现快速自适应矩估计

Roi Peleg,Roi Weiss,Assaf Hoogi

Network optimization is a crucial step in the field of deep learning, as it directly affects the performance of models in various domains such as computer vision. Despite the numerous optimizers that have been developed over the years, the current methods are still limited in their ability to accurately and quickly identify gradient trends, which can lead to sub-optimal network performance. In this paper, we propose a novel deep optimizer called Fast-Adaptive Moment Estimation (FAME), which for the first time estimates gradient moments using a Triple Exponential Moving Average (TEMA). Incorporating TEMA into the optimization process provides richer and more accurate information on data changes and trends, as compared to the standard Exponential Moving Average used in essentially all current leading adaptive optimization methods. Our proposed FAME optimizer has been extensively validated through a wide range of benchmarks, including CIFAR-10, CIFAR-100, PASCAL-VOC, MS-COCO, and Cityscapes, using 14 different learning architectures, six optimizers, and various vision tasks, including detection, classification and semantic understanding. The results demonstrate that our FAME optimizer outperforms other leading optimizers in terms of both robustness and accuracy.

翻译：网络优化是深度学习领域的关键步骤，直接影响模型在计算机视觉等多个领域的性能。尽管多年来已开发出众多优化器，但现有方法在准确快速识别梯度趋势方面仍存在局限性，这可能导致网络性能欠优。本文提出一种新型深度优化器——快速自适应矩估计（FAME），首次采用三重指数移动平均（TEMA）估计梯度矩。相较于当前主流自适应优化方法中普遍使用的标准指数移动平均，将TEMA引入优化过程可提供更丰富、更准确的数据变化与趋势信息。我们提出的FAME优化器已通过CIFAR-10、CIFAR-100、PASCAL-VOC、MS-COCO和Cityscapes等广泛基准测试，使用14种不同学习架构、六种优化器以及检测、分类和语义理解等多种视觉任务进行了全面验证。结果表明，我们的FAME优化器在鲁棒性和准确性方面均优于其他主流优化器。

0

相关内容

优化器

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

105+阅读 · 2022年2月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

52+阅读 · 2020年12月14日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

【AAAI2020论文】小样本网络压缩，Few Shot Network Compression via Cross Distillation (附pdf）

专知会员服务

26+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

基于特征骨架质谱定位法快速发现海绵中aaptamine生物碱类抗肿瘤先导化合物

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

套子代数的Hochschild上同调及套的分类

国家自然科学基金

3+阅读 · 2014年12月31日

二亚硝基哌嗪（DNP）介导Clusterin表达参与鼻咽癌转移的分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于双CCD视觉测量系统的多目标、运动光纤位置检测研究

国家自然科学基金

0+阅读 · 2013年12月31日

β-Sarcoglycan在mSOD1介导ALS骨骼肌病变中的机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

基于光合作用机制的纳晶光电转换复合材料的制备及性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

有界噪声激励下非线性系统的全局动力学研究

国家自然科学基金

0+阅读 · 2008年12月31日

慢性间断低氧对家兔颏舌肌运动皮质区调控上气道扩张肌的影响及作用机制

国家自然科学基金

0+阅读 · 2008年12月31日

食管癌转移高风险性相关的SNP位点筛查研究

国家自然科学基金

0+阅读 · 2008年12月31日

AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation

Arxiv

0+阅读 · 2023年7月24日

Bayesian taut splines for estimating the number of modes

Arxiv

0+阅读 · 2023年7月21日

Adaptive interventions for both accuracy and time in AI-assisted human decision making

Arxiv

0+阅读 · 2023年7月20日

A Hybrid Adaptive Controller for Soft Robot Interchangeability

Arxiv

0+阅读 · 2023年7月20日

Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality

Arxiv

0+阅读 · 2023年7月20日

Differences Between Hard and Noisy-labeled Samples: An Empirical Study

Arxiv

0+阅读 · 2023年7月20日

Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

Arxiv

0+阅读 · 2023年7月20日

The Geometric Median and Applications to Robust Mean Estimation

Arxiv

0+阅读 · 2023年7月19日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

67+阅读 · 2019年9月8日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

VIP会员

文章信息

相关主题

估计/估计量

最新内容

定向能反无人机系统最新发展动态

定向能反无人机系统最新发展动态

专知会员服务

0+阅读 · 34分钟前

从燃煤战舰到算法战争：水面指挥的永恒要求

从燃煤战舰到算法战争：水面指挥的永恒要求

专知会员服务

1+阅读 · 51分钟前

《短程弹道再入飞行器拦截时间中的一项异常现象》

《短程弹道再入飞行器拦截时间中的一项异常现象》

专知会员服务

1+阅读 · 54分钟前

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

专知会员服务

1+阅读 · 56分钟前

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

专知会员服务

1+阅读 · 今天13:13

《韩国国防政策与军备出口：韩国安全与国防政策如何塑造其国防工业与军备出口格局》最新100页报告

《韩国国防政策与军备出口：韩国安全与国防政策如何塑造其国防工业与军备出口格局》最新100页报告

专知会员服务

0+阅读 · 今天13:10

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

专知会员服务

5+阅读 · 6月16日

多模态代码智能综述：从视觉输入到可执行代码系统

多模态代码智能综述：从视觉输入到可执行代码系统

专知会员服务

7+阅读 · 6月16日

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

专知会员服务

5+阅读 · 6月16日

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

专知会员服务

5+阅读 · 6月16日

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

15+阅读 · 6月16日

《通过小型无人机系统将情报能力“作战化”》

《通过小型无人机系统将情报能力“作战化”》

专知会员服务

6+阅读 · 6月16日

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

专知会员服务

10+阅读 · 6月16日

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

专知会员服务

21+阅读 · 6月15日

消耗优势：美军的“精确规模化”概念

消耗优势：美军的“精确规模化”概念

专知会员服务

8+阅读 · 6月15日

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

105+阅读 · 2022年2月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

52+阅读 · 2020年12月14日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

【AAAI2020论文】小样本网络压缩，Few Shot Network Compression via Cross Distillation (附pdf）

专知会员服务

26+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

从燃煤战舰到算法战争：水面指挥的永恒要求

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

定向能反无人机系统最新发展动态

《短程弹道再入飞行器拦截时间中的一项异常现象》

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation

Arxiv

0+阅读 · 2023年7月24日

Bayesian taut splines for estimating the number of modes

Arxiv

0+阅读 · 2023年7月21日

Adaptive interventions for both accuracy and time in AI-assisted human decision making

Arxiv

0+阅读 · 2023年7月20日

A Hybrid Adaptive Controller for Soft Robot Interchangeability

Arxiv

0+阅读 · 2023年7月20日

Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality

Arxiv

0+阅读 · 2023年7月20日

Differences Between Hard and Noisy-labeled Samples: An Empirical Study

Arxiv

0+阅读 · 2023年7月20日

Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

Arxiv

0+阅读 · 2023年7月20日

The Geometric Median and Applications to Robust Mean Estimation

Arxiv

0+阅读 · 2023年7月19日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

67+阅读 · 2019年9月8日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

相关基金

基于特征骨架质谱定位法快速发现海绵中aaptamine生物碱类抗肿瘤先导化合物

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

套子代数的Hochschild上同调及套的分类

国家自然科学基金

3+阅读 · 2014年12月31日

二亚硝基哌嗪（DNP）介导Clusterin表达参与鼻咽癌转移的分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于双CCD视觉测量系统的多目标、运动光纤位置检测研究

国家自然科学基金

0+阅读 · 2013年12月31日

β-Sarcoglycan在mSOD1介导ALS骨骼肌病变中的机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

基于光合作用机制的纳晶光电转换复合材料的制备及性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

有界噪声激励下非线性系统的全局动力学研究

国家自然科学基金

0+阅读 · 2008年12月31日

慢性间断低氧对家兔颏舌肌运动皮质区调控上气道扩张肌的影响及作用机制

国家自然科学基金

0+阅读 · 2008年12月31日

食管癌转移高风险性相关的SNP位点筛查研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员