Bayesian Fixed-Budget Best-Arm Identification - 专知论文

会员服务 ·

0

赌博机/老虎机 · 情景 · 优化器 · 可辨认的 · 频率主义学派 ·

2023 年 2 月 3 日

Bayesian Fixed-Budget Best-Arm Identification

翻译：贝叶斯固定预算最优臂识别

Alexia Atsidakou,Sumeet Katariya,Sujay Sanghavi,Branislav Kveton

Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations. In this work, we study this problem in the Bayesian setting. We propose a Bayesian elimination algorithm and derive an upper bound on its probability of misidentifying the optimal arm. The bound reflects the quality of the prior and is the first distribution-dependent bound in this setting. We prove it using a frequentist-like argument, where we carry the prior through, and then integrate out the bandit instance at the end. We also provide the first lower bound on the probability of misidentification in a $2$-armed Bayesian bandit and show that our upper bound (almost) matches the lower bound. Our experiments show that Bayesian elimination is superior to frequentist methods and competitive with the state-of-the-art Bayesian algorithms that have no guarantees in our setting.

翻译：固定预算最优臂识别（BAI）是一个多臂老虎机问题，其中智能体在固定观测预算内最大化识别最优臂的概率。本文在贝叶斯框架下研究该问题。我们提出一种贝叶斯淘汰算法，并推导出其误识别最优臂概率的上界。该上界反映了先验信息的质量，是该设定下首个依赖于分布的界。我们采用类似频率学派的论证方法进行证明——在证明过程中保留先验分布，最后再对老虎机实例进行积分。同时，我们给出了2臂贝叶斯老虎机中误识别概率的首个下界，并表明所提出的上界与该下界（几乎）匹配。实验表明，贝叶斯淘汰算法优于频率学派方法，且可与在该设定下无理论保证的最先进贝叶斯算法相媲美。

0

相关内容

赌博机/老虎机

赌博机/老虎机

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

量化金融强化学习论文集合

量化金融强化学习论文集合

专知

14+阅读 · 2019年12月18日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

牛磺酸抑制AS肉鸡右心肥大过程中calpains介导细胞凋亡作用的研究

国家自然科学基金

0+阅读 · 2015年12月31日

受Mittag-Lef？er噪声激励的广义朗之万方程的随机共振研究

国家自然科学基金

0+阅读 · 2015年12月31日

多因素不确定情况下路面最优养护维修策略决策方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

III族氮化物半导体异质结构中的自旋轨道耦合效应

国家自然科学基金

0+阅读 · 2013年12月31日

非凸Hamilton系统的Aubry-Mather理论

国家自然科学基金

0+阅读 · 2012年12月31日

函数空间与度量测度空间上的分析

国家自然科学基金

0+阅读 · 2012年12月31日

基于Junction tree推理的多运动平台分散式协同导航算法研究

国家自然科学基金

2+阅读 · 2012年12月31日

结晶型超支化聚醚醚酮的分子设计、合成及其作为粘度调节剂的研究

国家自然科学基金

0+阅读 · 2012年12月31日

蛭石改性乳酸基共聚物机理的研究

国家自然科学基金

0+阅读 · 2010年12月31日

Simplification and Improvement of MMS Approximation

Arxiv

0+阅读 · 2023年3月29日

Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits

Arxiv

0+阅读 · 2023年3月28日

Efficient Alternating Minimization Solvers for Wyner Multi-View Unsupervised Learning

Arxiv

0+阅读 · 2023年3月28日

List Online Classification

Arxiv

0+阅读 · 2023年3月27日

Hypothesis testing and confidence sets: why Bayesian not frequentist, and how to set a prior with a regulatory authority

Arxiv

0+阅读 · 2023年3月27日

On the Convergence of Numerical Integration as a Finite Matrix Approximation to Multiplication Operator

Arxiv

0+阅读 · 2023年3月27日

Distributionally Robust Multiclass Classification and Applications in Deep Image Classifiers

Arxiv

0+阅读 · 2023年3月25日

Sequential Knockoffs for Variable Selection in Reinforcement Learning

Arxiv

0+阅读 · 2023年3月24日

Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data

Arxiv

0+阅读 · 2023年3月24日

High Dimensional Generalised Penalised Least Squares

Arxiv

0+阅读 · 2023年3月24日

VIP会员

文章信息

相关主题

赌博机/老虎机

频率主义学派

最新内容

面向国防作战的最佳自主与蜂群无人机技术

面向国防作战的最佳自主与蜂群无人机技术

专知会员服务

3+阅读 · 今天8:04

《异构人类团队的协作决策过程混合建模研究》

《异构人类团队的协作决策过程混合建模研究》

专知会员服务

3+阅读 · 今天7:59

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

专知会员服务

3+阅读 · 今天7:56

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

专知会员服务

3+阅读 · 今天7:50

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

3+阅读 · 7月27日

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

4+阅读 · 7月27日

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

12+阅读 · 7月27日

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

7+阅读 · 7月27日

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

6+阅读 · 7月27日

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

4+阅读 · 7月27日

《防空交战流程的概率建模研究》

《防空交战流程的概率建模研究》

专知会员服务

10+阅读 · 7月27日

ICML 2026 教程 | 数值优化理论还重要吗？

ICML 2026 教程 | 数值优化理论还重要吗？

专知会员服务

6+阅读 · 7月26日

ICM 2026 | 陶哲轩：人工智能时代的数学

ICM 2026 | 陶哲轩：人工智能时代的数学

专知会员服务

9+阅读 · 7月26日

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

专知会员服务

8+阅读 · 7月26日

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

专知会员服务

11+阅读 · 7月26日

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《异构人类团队的协作决策过程混合建模研究》

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

面向国防作战的最佳自主与蜂群无人机技术

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

量化金融强化学习论文集合

量化金融强化学习论文集合

专知

14+阅读 · 2019年12月18日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Simplification and Improvement of MMS Approximation

Arxiv

0+阅读 · 2023年3月29日

Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits

Arxiv

0+阅读 · 2023年3月28日

Efficient Alternating Minimization Solvers for Wyner Multi-View Unsupervised Learning

Arxiv

0+阅读 · 2023年3月28日

List Online Classification

Arxiv

0+阅读 · 2023年3月27日

Hypothesis testing and confidence sets: why Bayesian not frequentist, and how to set a prior with a regulatory authority

Arxiv

0+阅读 · 2023年3月27日

On the Convergence of Numerical Integration as a Finite Matrix Approximation to Multiplication Operator

Arxiv

0+阅读 · 2023年3月27日

Distributionally Robust Multiclass Classification and Applications in Deep Image Classifiers

Arxiv

0+阅读 · 2023年3月25日

Sequential Knockoffs for Variable Selection in Reinforcement Learning

Arxiv

0+阅读 · 2023年3月24日

Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data

Arxiv

0+阅读 · 2023年3月24日

High Dimensional Generalised Penalised Least Squares

Arxiv

0+阅读 · 2023年3月24日

相关基金

牛磺酸抑制AS肉鸡右心肥大过程中calpains介导细胞凋亡作用的研究

国家自然科学基金

0+阅读 · 2015年12月31日

受Mittag-Lef？er噪声激励的广义朗之万方程的随机共振研究

国家自然科学基金

0+阅读 · 2015年12月31日

多因素不确定情况下路面最优养护维修策略决策方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

III族氮化物半导体异质结构中的自旋轨道耦合效应

国家自然科学基金

0+阅读 · 2013年12月31日

非凸Hamilton系统的Aubry-Mather理论

国家自然科学基金

0+阅读 · 2012年12月31日

函数空间与度量测度空间上的分析

国家自然科学基金

0+阅读 · 2012年12月31日

基于Junction tree推理的多运动平台分散式协同导航算法研究

国家自然科学基金

2+阅读 · 2012年12月31日

结晶型超支化聚醚醚酮的分子设计、合成及其作为粘度调节剂的研究

国家自然科学基金

0+阅读 · 2012年12月31日

蛭石改性乳酸基共聚物机理的研究

国家自然科学基金

0+阅读 · 2010年12月31日

微信扫码咨询专知VIP会员