Cross-Entropy Estimators for Sequential Experiment Design with Reinforcement Learning - 专知论文

会员服务 ·

0

估计/估计量 · Learning · INFORMS · contrastive · 设计 ·

2023 年 5 月 29 日

Cross-Entropy Estimators for Sequential Experiment Design with Reinforcement Learning

翻译：基于强化学习的序贯实验设计的交叉熵估计器

Tom Blau,Edwin Bonilla,Iadine Chades,Amir Dezfouli

Reinforcement learning can effectively learn amortised design policies for designing sequences of experiments. However, current methods rely on contrastive estimators of expected information gain, which require an exponential number of contrastive samples to achieve an unbiased estimation. We propose an alternative lower bound estimator, based on the cross-entropy of the joint model distribution and a flexible proposal distribution. This proposal distribution approximates the true posterior of the model parameters given the experimental history and the design policy. Our estimator requires no contrastive samples, can achieve more accurate estimates of high information gains, allows learning of superior design policies, and is compatible with implicit probabilistic models. We assess our algorithm's performance in various tasks, including continuous and discrete designs and explicit and implicit likelihoods.

翻译：强化学习能够有效学习用于设计实验序列的摊销设计策略。然而，当前方法依赖于期望信息增益的对比估计量，这需要指数数量的对比样本才能实现无偏估计。我们提出了一种基于联合模型分布与灵活提议分布交叉熵的替代下界估计量。该提议分布近似于给定实验历史与设计策略下模型参数的真实后验。我们的估计量无需对比样本，能够更精确地估计高信息增益，支持学习更优的设计策略，并且与隐式概率模型兼容。我们在各类任务中评估了算法性能，包括连续与离散设计、显式与隐式似然。

0

相关内容

估计/估计量

估计/估计量

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

128+阅读 · 2022年4月21日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

129+阅读 · 2020年11月20日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

99+阅读 · 2019年12月23日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

247+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

LncRNA-HOTAIR介导酸性微环境下胰腺癌细胞侵袭转移的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

自噬在HIV-1gp120致海马神经元损伤过程中的作用及信号机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

缺氧肿瘤微环境在促进CD133+CXCR4-结肠癌细胞上皮间质转化中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

C1型尼曼-匹克氏症轴突发育异常的病理机制

国家自然科学基金

0+阅读 · 2013年12月31日

蛋白激酶LIMK1活性在小鼠卵母细胞染色体分离过程中的作用和分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

温阳活血利水法对TGF-β-Smads信号通路介导糖尿病心肌重构的效应机制

国家自然科学基金

0+阅读 · 2012年12月31日

TRPP2-STIM1相互作用：脑缺血再灌注损伤新机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

Re合金化镍基单晶高温合金的强韧化机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

改进的Unscented卡尔曼滤波与电池组SOC快速精确估计

国家自然科学基金

0+阅读 · 2008年12月31日

Hierarchically Composing Level Generators for the Creation of Complex Structures

Arxiv

0+阅读 · 2023年7月19日

Internally Rewarded Reinforcement Learning

Arxiv

0+阅读 · 2023年7月17日

Non-Stationary Policy Learning for Multi-Timescale Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年7月17日

Revisiting the Robustness of the Minimum Error Entropy Criterion: A Transfer Learning Case Study

Arxiv

0+阅读 · 2023年7月17日

Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale

Arxiv

0+阅读 · 2023年7月16日

Two-Sample Test with Copula Entropy

Arxiv

0+阅读 · 2023年7月14日

Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年7月14日

Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning

Arxiv

0+阅读 · 2023年7月13日

Self-Supervised Learning via Maximum Entropy Coding

Arxiv

13+阅读 · 2022年10月20日

Coding for Distributed Multi-Agent Reinforcement Learning

Arxiv

32+阅读 · 2021年1月7日

VIP会员

文章信息

相关主题

估计/估计量

最新内容

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

0+阅读 · 今天15:20

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

0+阅读 · 今天15:18

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

8+阅读 · 今天5:53

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

4+阅读 · 今天5:45

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

2+阅读 · 今天5:23

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

2+阅读 · 今天5:11

《防空交战流程的概率建模研究》

《防空交战流程的概率建模研究》

专知会员服务

6+阅读 · 今天5:04

ICML 2026 教程 | 数值优化理论还重要吗？

ICML 2026 教程 | 数值优化理论还重要吗？

专知会员服务

4+阅读 · 7月26日

ICM 2026 | 陶哲轩：人工智能时代的数学

ICM 2026 | 陶哲轩：人工智能时代的数学

专知会员服务

8+阅读 · 7月26日

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

专知会员服务

8+阅读 · 7月26日

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

专知会员服务

10+阅读 · 7月26日

《先进防空系统选型战略框架：基于巴基斯坦的实证启示》

《先进防空系统选型战略框架：基于巴基斯坦的实证启示》

专知会员服务

8+阅读 · 7月26日

《反无人机交战场景下的战斗归零研究》

《反无人机交战场景下的战斗归零研究》

专知会员服务

7+阅读 · 7月26日

霍尔木兹与不对称作战时代：水雷、无人系统与海军力量的重新定义

霍尔木兹与不对称作战时代：水雷、无人系统与海军力量的重新定义

专知会员服务

4+阅读 · 7月26日

博士论文 | 用代码结构感知方法推进代码大模型

博士论文 | 用代码结构感知方法推进代码大模型

专知会员服务

5+阅读 · 7月25日

相关VIP内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

128+阅读 · 2022年4月21日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

129+阅读 · 2020年11月20日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

99+阅读 · 2019年12月23日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

247+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

美空军新型反无人机部队初探

博士论文 | 面向大模型推理的内存高效算法

《无人系统互操作性导论——无人系统联合架构（JAUS）》

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Hierarchically Composing Level Generators for the Creation of Complex Structures

Arxiv

0+阅读 · 2023年7月19日

Internally Rewarded Reinforcement Learning

Arxiv

0+阅读 · 2023年7月17日

Non-Stationary Policy Learning for Multi-Timescale Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年7月17日

Revisiting the Robustness of the Minimum Error Entropy Criterion: A Transfer Learning Case Study

Arxiv

0+阅读 · 2023年7月17日

Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale

Arxiv

0+阅读 · 2023年7月16日

Two-Sample Test with Copula Entropy

Arxiv

0+阅读 · 2023年7月14日

Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年7月14日

Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning

Arxiv

0+阅读 · 2023年7月13日

Self-Supervised Learning via Maximum Entropy Coding

Arxiv

13+阅读 · 2022年10月20日

Coding for Distributed Multi-Agent Reinforcement Learning

Arxiv

32+阅读 · 2021年1月7日

相关基金

LncRNA-HOTAIR介导酸性微环境下胰腺癌细胞侵袭转移的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

自噬在HIV-1gp120致海马神经元损伤过程中的作用及信号机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

缺氧肿瘤微环境在促进CD133+CXCR4-结肠癌细胞上皮间质转化中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

C1型尼曼-匹克氏症轴突发育异常的病理机制

国家自然科学基金

0+阅读 · 2013年12月31日

蛋白激酶LIMK1活性在小鼠卵母细胞染色体分离过程中的作用和分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

温阳活血利水法对TGF-β-Smads信号通路介导糖尿病心肌重构的效应机制

国家自然科学基金

0+阅读 · 2012年12月31日

TRPP2-STIM1相互作用：脑缺血再灌注损伤新机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

Re合金化镍基单晶高温合金的强韧化机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

改进的Unscented卡尔曼滤波与电池组SOC快速精确估计

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员