AutoRL Hyperparameter Landscapes - 专知论文

会员服务 ·

0

超参数 · 优化器 · Automator · SAC · HTTPS ·

2023 年 6 月 5 日

AutoRL Hyperparameter Landscapes

翻译：AutoRL超参数景观

Aditya Mohan,Carolin Benjamins,Konrad Wienecke,Alexander Dockhorn,Marius Lindauer

from arxiv, Version updated after acceptance

Although Reinforcement Learning (RL) has shown to be capable of producing impressive results, its use is limited by the impact of its hyperparameters on performance. This often makes it difficult to achieve good results in practice. Automated RL (AutoRL) addresses this difficulty, yet little is known about the dynamics of the hyperparameter landscapes that hyperparameter optimization (HPO) methods traverse in search of optimal configurations. In view of existing AutoRL approaches dynamically adjusting hyperparameter configurations, we propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training. Addressing an important open question on the legitimacy of such dynamic AutoRL approaches, we provide thorough empirical evidence that the hyperparameter landscapes strongly vary over time across representative algorithms from RL literature (DQN, PPO, and SAC) in different kinds of environments (Cartpole, Bipedal Walker, and Hopper) This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses. Our code can be found at https://github.com/automl/AutoRL-Landscape

翻译：尽管强化学习（RL）已展现出产生显著结果的能力，但其应用受限于超参数对性能的影响，这往往使其在实践中难以获得良好效果。自动强化学习（AutoRL）旨在解决这一难题，然而对于超参数优化（HPO）方法在寻求最优配置时所穿越的超参数景观的动态特性，目前仍知之甚少。鉴于现有AutoRL方法会动态调整超参数配置，本文提出了一种构建与分析超参数景观的方法，该分析不仅针对单一时间点，而是贯穿训练过程中的多个时间点。针对这类动态AutoRL方法有效性的关键开放问题，我们提供了充分的实证证据，证明在RL文献中的代表性算法（DQN、PPO与SAC）以及不同类型环境（Cartpole、Bipedal Walker和Hopper）中，超参数景观会随训练时间发生显著变化。这一发现支持了训练过程中应动态调整超参数的理论，并展示了通过景观分析获取AutoRL问题更深入洞察的潜力。我们的代码可从https://github.com/automl/AutoRL-Landscape获取。

0

相关内容

超参数

在贝叶斯统计中，超参数是先验分布的参数；该术语用于将它们与所分析的基础系统的模型参数区分开。

【硬核书】深度强化学习实践手册：应用现代RL方法，包括深度Q网络、值迭代、策略梯度、TRPO、AlphaGo等，547页pdf

【硬核书】深度强化学习实践手册：应用现代RL方法，包括深度Q网络、值迭代、策略梯度、TRPO、AlphaGo等，547页pdf

专知会员服务

79+阅读 · 2022年12月11日

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

专知会员服务

60+阅读 · 2022年5月5日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

发射可调铂(II)配合物的设计和新型静电喷雾沉积电致发光器件的制备

国家自然科学基金

0+阅读 · 2015年12月31日

三氧化二砷在TBLR1-RARα阳性急性早幼粒细胞白血病分化和凋亡中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

靶向EphB4的放射性分子探针在体诊治评价

国家自然科学基金

0+阅读 · 2014年12月31日

应变诱导埋嵌型纳米颗粒结构相变及其相关性能的研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

刺激响应性树枝形分子囊泡的构建、自组装及药物控制释放行为研究

国家自然科学基金

0+阅读 · 2013年12月31日

新型高折射率环硫树脂的分子设计与合成研究

国家自然科学基金

0+阅读 · 2012年12月31日

聚合物太阳能电池新型高效光伏材料的研究

国家自然科学基金

0+阅读 · 2012年12月31日

资助《数学进展》期刊

国家自然科学基金

3+阅读 · 2009年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing

Arxiv

0+阅读 · 2023年7月26日

Local Problems on Grids from the Perspective of Distributed Algorithms, Finitary Factors, and Descriptive Combinatorics

Arxiv

0+阅读 · 2023年7月25日

marl-jax: Multi-Agent Reinforcement Leaning Framework

marl-jax: Multi-Agent Reinforcement Leaning Framework

Arxiv

0+阅读 · 2023年7月25日

Submodular Reinforcement Learning

Arxiv

0+阅读 · 2023年7月25日

A Model Predictive Capture Point Control Framework for Robust Humanoid Balancing via Ankle, Hip, and Stepping Strategies

Arxiv

0+阅读 · 2023年7月25日

Counterfactual Explanation Policies in RL

Arxiv

0+阅读 · 2023年7月25日

Pretraining in Deep Reinforcement Learning: A Survey

Arxiv

21+阅读 · 2022年11月8日

A Survey on Automated Driving System Testing: Landscapes and Trends

Arxiv

12+阅读 · 2022年6月13日

Trust in Human-AI Interaction: Scoping Out Models, Measures, and Methods

Arxiv

22+阅读 · 2022年4月30日

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

Arxiv

33+阅读 · 2022年1月11日

VIP会员

文章信息

相关主题

最新内容

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

5+阅读 · 今天8:00

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

4+阅读 · 今天7:44

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

4+阅读 · 今天7:28

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

4+阅读 · 今天7:18

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

专知会员服务

5+阅读 · 今天7:07

军事欺骗：供作战战术指挥官使用的工具

军事欺骗：供作战战术指挥官使用的工具

专知会员服务

4+阅读 · 今天7:03

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

4+阅读 · 6月23日

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

5+阅读 · 6月23日

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

10+阅读 · 6月23日

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

4+阅读 · 6月23日

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

5+阅读 · 6月23日

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

8+阅读 · 6月23日

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

7+阅读 · 6月23日

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

4+阅读 · 6月23日

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

6+阅读 · 6月22日

相关VIP内容

【硬核书】深度强化学习实践手册：应用现代RL方法，包括深度Q网络、值迭代、策略梯度、TRPO、AlphaGo等，547页pdf

【硬核书】深度强化学习实践手册：应用现代RL方法，包括深度Q网络、值迭代、策略梯度、TRPO、AlphaGo等，547页pdf

专知会员服务

79+阅读 · 2022年12月11日

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

专知会员服务

60+阅读 · 2022年5月5日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

重新思考无人机时代的生存能力

在人工智能加速决策环境中拓展OODA循环

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

装甲突击旅：现代战争思考、战斗与组织

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

相关论文

Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing

Arxiv

0+阅读 · 2023年7月26日

Local Problems on Grids from the Perspective of Distributed Algorithms, Finitary Factors, and Descriptive Combinatorics

Arxiv

0+阅读 · 2023年7月25日

marl-jax: Multi-Agent Reinforcement Leaning Framework

marl-jax: Multi-Agent Reinforcement Leaning Framework

Arxiv

0+阅读 · 2023年7月25日

Submodular Reinforcement Learning

Arxiv

0+阅读 · 2023年7月25日

A Model Predictive Capture Point Control Framework for Robust Humanoid Balancing via Ankle, Hip, and Stepping Strategies

Arxiv

0+阅读 · 2023年7月25日

Counterfactual Explanation Policies in RL

Arxiv

0+阅读 · 2023年7月25日

Pretraining in Deep Reinforcement Learning: A Survey

Arxiv

21+阅读 · 2022年11月8日

A Survey on Automated Driving System Testing: Landscapes and Trends

Arxiv

12+阅读 · 2022年6月13日

Trust in Human-AI Interaction: Scoping Out Models, Measures, and Methods

Arxiv

22+阅读 · 2022年4月30日

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

Arxiv

33+阅读 · 2022年1月11日

相关基金

发射可调铂(II)配合物的设计和新型静电喷雾沉积电致发光器件的制备

国家自然科学基金

0+阅读 · 2015年12月31日

三氧化二砷在TBLR1-RARα阳性急性早幼粒细胞白血病分化和凋亡中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

靶向EphB4的放射性分子探针在体诊治评价

国家自然科学基金

0+阅读 · 2014年12月31日

应变诱导埋嵌型纳米颗粒结构相变及其相关性能的研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

刺激响应性树枝形分子囊泡的构建、自组装及药物控制释放行为研究

国家自然科学基金

0+阅读 · 2013年12月31日

新型高折射率环硫树脂的分子设计与合成研究

国家自然科学基金

0+阅读 · 2012年12月31日

聚合物太阳能电池新型高效光伏材料的研究

国家自然科学基金

0+阅读 · 2012年12月31日

资助《数学进展》期刊

国家自然科学基金

3+阅读 · 2009年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员