Incentivizing Exploration with Linear Contexts and Combinatorial Actions


June 3, 2023


From arXiv; International Conference on Machine Learning (ICML) 2023

We advance the study of incentivized bandit exploration, in which arm choices are viewed as recommendations and are required to be Bayesian incentive compatible. Recent work has shown under certain independence assumptions that after collecting enough initial samples, the popular Thompson sampling algorithm becomes incentive compatible. We give an analog of this result for linear bandits, where the independence of the prior is replaced by a natural convexity condition. This opens up the possibility of efficient and regret-optimal incentivized exploration in high-dimensional action spaces. In the semibandit model, we also improve the sample complexity for the pre-Thompson sampling phase of initial data collection.
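To make the two-phase structure in the abstract concrete (collect initial samples first, then run Thompson sampling), here is a minimal Python sketch. It is not the paper's algorithm and does not check Bayesian incentive compatibility. It assumes a zero-mean Gaussian prior on the unknown parameter, Gaussian noise with known standard deviation, and a simple round-robin warm-up. The function name `linear_thompson_sampling` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_thompson_sampling(actions, theta_true, n_warmup, n_rounds,
                             noise_std=1.0, prior_var=1.0, seed=0):
    """Gaussian linear Thompson sampling after a round-robin warm-up phase.

    actions: (K, d) array of feature vectors, one row per arm.
    Returns the chosen arm indices and the final posterior mean.
    Illustrative only: the prior, noise model, and warm-up rule are assumptions.
    """
    rng = np.random.default_rng(seed)
    n_arms, dim = actions.shape
    # Posterior over theta is N(inv(precision) @ b, inv(precision)).
    precision = np.eye(dim) / prior_var   # prior precision (zero prior mean)
    b = np.zeros(dim)                     # noise-weighted sum of x * reward
    chosen = []
    for t in range(n_warmup + n_rounds):
        if t < n_warmup:
            # Warm-up phase: cycle through the arms to gather initial samples.
            arm = t % n_arms
        else:
            # Thompson step: draw theta from the posterior, then act greedily.
            cov = np.linalg.inv(precision)
            cov = (cov + cov.T) / 2       # symmetrise against round-off
            theta_sample = rng.multivariate_normal(cov @ b, cov)
            arm = int(np.argmax(actions @ theta_sample))
        x = actions[arm]
        reward = x @ theta_true + noise_std * rng.normal()
        # Conjugate Gaussian update of the posterior.
        precision += np.outer(x, x) / noise_std**2
        b += x * reward / noise_std**2
        chosen.append(arm)
    posterior_mean = np.linalg.solve(precision, b)
    return np.array(chosen), posterior_mean
```

A call such as `linear_thompson_sampling(np.eye(3), np.array([0.1, 0.9, 0.2]), n_warmup=30, n_rounds=300, noise_std=0.1)` returns the sequence of chosen arms and the final posterior mean. In the incentivized setting, the length of the warm-up phase is the quantity whose sample complexity the abstract refers to; here it is just a free parameter.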

