February 15, 2023

CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies


Simo Alami. C, Fernando Llorente, Rim Kaddah, Luca Martino, Jesse Read

From arXiv; 2022 30th European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 2022, pp. 1482-1486

Reinforcement learning has drawn huge interest as a tool for solving optimal control problems. Solving a given problem (task or environment) involves converging towards an optimal policy. However, there may exist multiple optimal policies that differ dramatically in their behaviour; for example, some may be faster than others, but at the expense of greater risk. We consider and study a distribution of optimal policies. We design a curiosity-augmented Metropolis algorithm (CAMEO) that samples optimal policies such that these policies adopt diverse behaviours, which implies greater coverage of the different possible optimal policies. In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems, even in the challenging case of environments that provide sparse rewards. We further show that the different policies we sample present different risk profiles, which corresponds to interesting practical applications in interpretability and represents a first step towards learning the distribution of optimal policies itself.
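
The abstract does not spell out the algorithm, so the following is only a toy illustration of the general idea, not the authors' method: a random-walk Metropolis sampler over a single policy parameter, with an ad hoc novelty bonus standing in for curiosity. Everything here is an assumption made for the sketch: the bimodal `toy_return`, the `novelty` score, the bounds on `theta`, and the parameters `temperature` and `beta`. Because the bonus depends on the sampling history, the chain is a heuristic rather than an exact sampler of a fixed target.

```python
import math
import random


def toy_return(theta):
    """Hypothetical return with two equally good optima, at theta = -2 and +2."""
    return max(math.exp(-(theta - 2.0) ** 2), math.exp(-(theta + 2.0) ** 2))


def novelty(theta, archive, scale=1.0):
    """Hypothetical curiosity proxy: distance to the nearest archived sample, capped at 1."""
    if not archive:
        return 1.0
    nearest = min(abs(theta - a) for a in archive)
    return min(nearest / scale, 1.0)


def metropolis_with_novelty(n_steps=5000, step=2.0, temperature=0.2,
                            beta=0.5, low=-4.0, high=4.0, seed=0):
    """Random-walk Metropolis on theta in [low, high] with a novelty bonus.

    The unnormalised log-target is return / temperature + beta * novelty.
    The archive changes over time, so the target is not stationary; this
    is a heuristic sketch, not an exact MCMC scheme.
    """
    rng = random.Random(seed)
    theta = 0.0
    archive = []   # thinned history used only for the novelty score
    samples = []
    for i in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        if low <= proposal <= high:              # zero target outside the box
            log_cur = toy_return(theta) / temperature + beta * novelty(theta, archive)
            log_new = toy_return(proposal) / temperature + beta * novelty(proposal, archive)
            # Metropolis rule: accept with probability min(1, target ratio)
            if rng.random() < math.exp(min(0.0, log_new - log_cur)):
                theta = proposal
        samples.append(theta)
        if i % 50 == 0:                          # keep the archive small
            archive.append(theta)
    return samples


if __name__ == "__main__":
    draws = metropolis_with_novelty()
    near_left = sum(1 for s in draws if abs(s + 2.0) < 1.0)
    near_right = sum(1 for s in draws if abs(s - 2.0) < 1.0)
    print("samples near theta=-2:", near_left)
    print("samples near theta=+2:", near_right)
```

In this toy setting the chain visits both optima instead of settling on one, which mirrors the abstract's goal of covering several distinct optimal policies. The wide proposal step is what lets the chain jump between the two modes.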

